A method for estimation of cancer risk using sequence polymorphisms in a
specific region of chromosome 19



The present invention provides methods and compositions for identifying human
subjects with an increased risk of having or developing cancer. In particular, this
invention relates to the identification and characterization of polymorphisms in
human chromosome 19q, in the region r located approximately at 19q13.2-3, that are
correlated with increased risk of developing cancer and with the responsiveness of a
subject to various treatments for cancer.

Background

DNA polymorphisms provide an efficient way to study the association of genes and
diseases by analysis of linkage and linkage disequilibrium. With the sequencing of
the human genome, a myriad of hitherto unknown genetic polymorphisms among
people have been detected. Most common among these are the single nucleotide
polymorphisms, also called SNPs, of which several million are now known. Other
examples are variable number of tandem repeat polymorphisms, insertions,
deletions and block modifications. Tandem repeats often have multiple different alleles
(variants), whereas the other groups of polymorphisms usually have just two alleles.
Some of these genetic polymorphisms probably play a direct role in the biology of
the individuals, including their risk of developing disease, but the virtue of the
majority is that they can serve as markers for the surrounding DNA, and thus serve as
leads during a search for a causative gene polymorphism, as substitutes in the
evaluation of its role in health and disease, and as substitutes in the evaluation of
the genetic constitution of individuals.

The association of an allele of one sequence polymorphism with particular alleles of
other sequence polymorphisms in the surrounding DNA has two origins, known in
the genetic field as linkage and linkage disequilibrium, respectively. Linkage arises
because large parts of chromosomes are passed unchanged from parents to
offspring, so that minor regions of a chromosome tend to flow unchanged from one
generation to the next and also to be similar in different branches of the same
family. Linkage is gradually eroded by recombination occurring in the cells of the
germline, but typically operates over multiple generations and distances of a number of
million bases in the DNA.

Linkage disequilibrium deals with whole populations and has its origin in the (distant)
forefather in whose DNA a new sequence polymorphism arose. The immediate
surroundings in the DNA of the forefather will tend to stay with the new allele for many
generations. Recombination and changes in the composition of the population will
again erode the association, but the new allele and the alleles of any other
polymorphism nearby will often be partly associated among unrelated humans even today. A
crude estimate suggests that alleles of sequence polymorphisms separated by less
than 10,000 bases in the DNA will have tended to stay together since modern man
arose. Linkage disequilibrium in limited populations, for instance Europeans, often
extends over longer distances. This can be the result of newer mutations, but can
also be a consequence of one or more "bottlenecks" with small population sizes and
considerable inbreeding in the history of the current population. Two obvious
possibilities for "bottlenecks" in Europeans are the exodus from Africa and the
repopulation of Europe after the last ice age.
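
The strength of the association between alleles at two nearby polymorphisms is commonly summarised with the standard population-genetics measures D, D' and r^2. The following Python sketch is illustrative only and is not part of the described method; the function name and the example frequencies are assumptions, and the haplotype frequency p_ab is taken as given rather than estimated from genotype data.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A at locus 1
            and allele B at locus 2 (assumed known)
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by its largest possible magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = (d * d) / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical example: the two alleles always travel together.
print(linkage_disequilibrium(0.5, 0.5, 0.5))
```

A value of r^2 close to 1 means one marker is a near-perfect substitute for the other, which is the property that makes marker polymorphisms useful as leads towards a causative polymorphism.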

Linkage disequilibrium is the result of many stochastic events and as such is subject
to statistical variation, occasionally resulting in discontinuities, lack of a monotonic
relationship between association and distance, and differences between people of
different ethnicity. Therefore, it is often advantageous to study more than one
sequence polymorphism in a given region. This also allows for further definition of the
genetic surroundings of the biologically relevant polymorphism by combining the
associated alleles of the different markers into a so-called haplotype.

Humans in general carry two copies of each human chromosome in each cell. There
are exceptions to this rule, not relevant to this application. We therefore speak about
genotypes, i.e. the combined analysis of both chromosomes at a given sequence
polymorphism. The resulting genotypes of a person, analysed for instance on DNA
from peripheral blood leukocytes, are inherently very stable over time. Therefore,
this type of analysis can be performed at any time in the life of a person and will be
applicable to this person for his or her entire life. By the same token, such genetic
analyses are ideally suited to predict future risks of disease.
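
As a small illustration of what a genotype is in this sense, the following Python sketch represents each person's genotype at one polymorphism as a pair of alleles (one per chromosome copy), classifies it, and derives allele frequencies for a group. The function names and the example genotypes are hypothetical and serve only to illustrate the terminology.

```python
from collections import Counter


def classify_genotype(allele1, allele2):
    """Classify a genotype given the alleles on the two chromosome copies."""
    return "homozygous" if allele1 == allele2 else "heterozygous"


def allele_frequencies(genotypes):
    """Compute allele frequencies from a list of (allele1, allele2) pairs."""
    counts = Counter()
    for allele1, allele2 in genotypes:
        # Each person contributes two alleles, one per chromosome copy.
        counts[allele1] += 1
        counts[allele2] += 1
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical genotypes of four persons at an A/G polymorphism.
example = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G")]
print([classify_genotype(a, b) for a, b in example])
print(allele_frequencies(example))
```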

A variety of investigations suggest that many diseases are in part determined by the
genetic constitution of the individual. One group of genes in particular has been
associated with rare genetic predispositions to cancer. These are the genes
involved in maintaining the integrity of a person's DNA, the so-called DNA repair
genes. One set of such genes are the XP genes, which participate in nucleotide
excision repair and, when mutated, give rise to a 1000-fold increased risk of getting
skin cancer. For this reason we have previously investigated single nucleotide
polymorphisms in one DNA repair gene, XPD, for association with risk of skin cancer in a
cohort of Caucasian Americans, and found that one allele of the sequence
polymorphism called XPDe6 was associated with a moderately increased risk of getting
basal cell carcinoma, the most common form of skin cancer. Later, other groups have
studied the association between sequence polymorphisms in this and other DNA
repair genes and various forms of cancer. Some have reported positive results.

Very little is known about the function of the gene RAI. It was cloned because its
protein product binds to and inhibits RelA of the transcription regulator NF-kappaB.

Summary of the invention 

The present invention relates in a first aspect to a group of nucleic acid sequences
found to be associated with cancer. The invention further relates to transcriptional
and translational products of said sequences. An allele in the r region can be
identified as correlated with an increased risk of developing cancer, the prognosis of
developed cancer, and responsiveness to cancer treatment on the basis of statistical
analyses of the incidence of a particular allele in individuals diagnosed with cancer.

Thus, in a first aspect the invention relates to a method for estimating the cancer risk
of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or

- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ ID
NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence
polymorphism response.

The estimation of the cancer risk of an individual can involve the comparison of the 
number and/or kind of polymorphic sequences identified with a predetermined can- 
cer risk profile. Such a profile can be based on statistical data obtained for a rele- 
vant reference group of individuals. 
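
One common way to derive such a statistical profile from a reference group is to compare allele carriers among cases and controls with an odds ratio. The Python sketch below is a minimal illustration under stated assumptions: the counts are invented, the function name is hypothetical, and the approximate 95 % confidence interval uses the standard log-odds (Woolf) method, which is not necessarily the analysis used in the Examples.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Return (odds ratio, lower 95% bound, upper 95% bound).

    All four counts must be positive; a zero cell would need a
    continuity correction, which this sketch does not apply.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers

    # Odds of carrying the allele among cases divided by the odds
    # of carrying it among controls.
    estimate = (a * d) / (b * c)

    # Standard error of log(OR) by the Woolf method.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Hypothetical reference group: 100 cases and 100 controls.
print(odds_ratio(40, 60, 20, 80))
```

An odds ratio above 1 whose confidence interval excludes 1 would mark the allele as a candidate risk marker for the profile.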

The sequence of the r region is set forth as SEQ ID NO: 1, originating from the
cloning of human chromosome 19q published as part of the contig NT_011109 in the
database of human sequences established by the National Center for Biotechnology
Information and located on the internet at
http://www.ncbi.nlm.nih.gov/genome/guide/human/

The presence of an allele is determined by determining the nucleic acid sequence of
all or part of the region according to standard molecular biology protocols well
known in the art, as described for example in Sambrook et al. (1989) and as set forth
in the Examples provided herein, or products of the nucleic acid sequences.

In particular, the nucleic acid molecules of the present invention represent in a first
aspect nucleic acid sequences forming part of the region r corresponding to positions
1522-37752 of SEQ ID NO: 1, and preferably certain nucleic acid sequences
within the gene referred to herein as RAI. As demonstrated in the Examples
presented below, the RAI gene is in particular associated with human cancer diseases.

Furthermore, the invention relates to a method for estimating the cancer prognosis
of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or

- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ ID
NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer prognosis of said individual based on the sequence
polymorphism response.

The estimation of the cancer prognosis of an individual can involve the comparison
of the number and/or kind of polymorphic sequences identified with a predetermined
cancer prognosis profile. Such a profile can be based on statistical data obtained for
a relevant reference group of individuals.

Additionally provided is a method of identifying a human subject as having an
increased likelihood of responding to a treatment, comprising a) correlating the
presence of an r region allele genotype with an increased likelihood of responding to
treatment; and b) determining the r region allele genotype of the subject, whereby a
subject having an r region allele genotype correlated with an increased likelihood of
responding to treatment is identified as having an increased likelihood of responding
to treatment.

Thus, the present invention also relates to a method for estimating a treatment
response of an individual suffering from cancer to a cancer treatment, comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or

- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ ID
NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the individual's response to the cancer treatment based on the
sequence polymorphism response.

The estimation of the individual's response to cancer treatment can involve the
comparison of the number and/or kind of polymorphic sequences identified with a
predetermined cancer treatment response profile. Such a profile can be based on
statistical data obtained for a relevant reference group of individuals.

The invention also comprises primers or probes for use in the invention, as well as
kits including these. The primers and/or probes are preferably capable of hybridising
to SEQ ID NO: 1, or a part thereof, in particular the r region, or a part thereof,
under stringent conditions.

Furthermore, the invention also relates to cloning vectors and expression vectors
containing the nucleic acid molecules of the invention, as well as hosts which have
been transformed with such nucleic acid molecules, including cells genetically
engineered to contain the nucleic acid molecules of the invention, and/or cells
genetically engineered to express the nucleic acid molecules of the invention. The nucleic
acids are preferably isolated from the r region and preferably contain one or more
sequence polymorphisms as described herein below in more detail. In addition to
host cells and cell lines, hosts also include transgenic non-human animals (or
progeny thereof).

In particular, the present invention is based on the discovery of the correlation between
single nucleotide polymorphisms (SNPs) and/or tandem repeats in the r region and
cancer. Thus, SNPs have been found in the r region as shown in Table 1. However,
the present invention is not limited to the SNPs shown in Table 1, but does include
any SNP in the region. Tandem repeats have been found in the r region as shown in
Table 2. However, the present invention is not limited to the tandem repeats shown
in Table 2, but does include any tandem repeat in the region.

The term human includes both a human having or suspected of having a cancer
disease and an asymptomatic human who may be tested for predisposition or
susceptibility to cancer. At each position the human may be homozygous for an allele or
the human may be a heterozygote.

Drawings 

Fig. 1 shows a subregion of chromosome 19q 

Fig. 2 shows odds ratios and p-values for individual sequence variations in relation
to risk of basal cell carcinoma

Fig. 3 shows odds p-values for association of different sequence variations with risk 
of basal cell carcinoma among psoriatic Danes 

Detailed description of the invention 

The present invention relates to a characterization of a person's present and/or
future risk of getting certain forms of cancer. The characterization is based on the
analysis of sequence polymorphisms in a region of chromosome 19q in the person.

A number of polymorphisms in the chromosomal region 19q13.2-3 have been
identified and characterised. Surprisingly, the sequence polymorphisms with the strongest
association to disease appeared to be located outside XPD. More specifically, the
sequences were located in a sub-region between XPD and ERCC1, and the association
seemed to have a maximum in or around the gene RAI (see Example 1). For persons getting
their skin cancer relatively early (before 50 years of age), it was found that
predictions got better (Example 2), and when two sequence polymorphisms in RAI were
combined, the prediction of early skin cancer got even better (Example 3). It was
also possible to combine sequence polymorphisms in RAI with sequence
polymorphisms outside the region and get highly positive results (Example 4).

The region of chromosome 19q, more precisely the region located in 19q13.2-3, with
which the present invention is concerned, is depicted in Figure 1 as it is presently
known together with the presently known or suspected genes. The arrows indicate
the directions of transcription of the genes. The absolute chromosome positions
shown are from the particular build of NCBI's map of chromosome 19, and will
probably change with time.

The region r stretches from the beginning of, but not including, the XPD gene, to
approximately the end of ERCC1 and includes the genes RAI, LOC162978, and
ASE-1. More specifically, r is bounded by and includes the following two sequences:
AGAACCCCCQ CCCCTCCACC TCGTCTCAAA and TCCCTCCCCA GAGACTGCAC
CAGCGCAGCC, and is defined by SEQ ID NO: 1.

In the present context the region r means SEQ ID NO: 1 and the complementary
sequence as well as transcriptional products and translational products thereof.

One preferred section of the region r stretches approximately from the end of RAI to
the beginning of ASE-1 and includes the genes RAI, LOC162978, and ASE-1. More
specifically, this section of r is bounded by and includes the following sequences:
GAAGTGAGCC AAGATCACGC CACTGCACTC and GTGCCCACCT GGGCCACCAG
AAGGTGACAC. In the present context the region r means SEQ ID NO: 1
bases 1522-37752 and the complementary sequence as well as transcriptional products
and translational products thereof.

Finally, in the claims the gene RAI is defined as including transcribed sequences of
the gene plus a 1500 base upstream promoter region. More specifically, RAI is
bounded by and includes the following sequences: CATAACCACA ATGATGAGCA
TGTATTGAGT and ATGTTGTCCA GGCTGGTCTT GAACTCCTGA. In the present
context this section of the region r relates to SEQ ID NO: 1 bases 7761-22885 and
the complementary sequence as well as transcriptional products and translational
products thereof.
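
The nested coordinate ranges quoted above can be expressed as a simple lookup. The following Python sketch is illustrative only; the constant and function names are assumptions, the total length of SEQ ID NO: 1 is not encoded because it is not stated here, and the coordinates are those of the sequence listing as quoted, which may shift in later genome builds.

```python
# Coordinate ranges within SEQ ID NO: 1 (1-based, inclusive), as quoted
# in the text. The section names are labels chosen for this sketch.
NAMED_SECTIONS = {
    "preferred section": (1522, 37752),
    "RAI": (7761, 22885),
}


def sections_containing(position):
    """Return the named sections of region r that contain a position."""
    if position < 1:
        raise ValueError("positions in SEQ ID NO: 1 are 1-based")
    return [name for name, (start, end) in NAMED_SECTIONS.items()
            if start <= position <= end]


# Position 7887 lies inside both the preferred section and RAI.
print(sections_containing(7887))
# Position 137 lies in region r but in neither named section.
print(sections_containing(137))
```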

Modifications to the human genome map are known to occur from time to time. It is 
therefore possible that the defining sequences quoted above will change slightly in 
future maps. 

Fragments or parts of the region r as used herein relate to any fragment of at least
100 nucleic acid residues in length, or multiples of 100 nucleic acid residues in length,
starting from SEQ ID NO: 1 position 1, 100, 200, 300, 400, 500, 600, 700, 800, 900,
1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200,
2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, and so forth, each fragment
starting position having an increment of 100 nucleic acid residues. Multiples are
preferably multiples of e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49 and 50.

For fragments starting at position 1, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, and so forth, using suitable multiplicators as listed herein above.

For fragments starting at position 100, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, and so forth, using suitable multiplicators as listed herein above.

For fragments starting at position 7700, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000,
8500, 9000, 9500, 10000, 10500, 11000, 11500, 12000, 12500, 13000, 13500,
14000, 14500, 15000, and so forth, using suitable multiplicators such as e.g. the
ones listed herein above.
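
The fragment definition above amounts to combining a start position with a length that is a multiple of 100 residues. A minimal Python sketch of that enumeration follows; the function names are hypothetical, the upper limit on start positions is a parameter because the length of SEQ ID NO: 1 is not stated here, and the default multipliers 1 to 50 are only the preferred examples given in the text (the position 7700 example also lists longer fragments).

```python
def start_positions(max_start):
    """Start positions 1, 100, 200, ... up to and including max_start."""
    return [1] + list(range(100, max_start + 1, 100))


def fragment_lengths(multipliers=range(1, 51), unit=100):
    """Fragment lengths as multiples of the 100-residue unit."""
    return [m * unit for m in multipliers]


def fragments(max_start, multipliers=range(1, 51)):
    """Yield (start, end) coordinates, 1-based and inclusive."""
    for start in start_positions(max_start):
        for length in fragment_lengths(multipliers):
            yield start, start + length - 1


# First few fragments starting at position 1.
for coords in list(fragments(300))[:3]:
    print(coords)
```

A caller would additionally need to discard fragments whose end exceeds the length of the sequence actually used.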
The nucleic acid sequences according to the present invention make it possible to
estimate cancer risk in an individual by using sequence polymorphisms originating
from a specific region of chromosome 19.

Estimation of cancer risks has a number of important applications: 

(1) Individuals with reasons to suspect that they are at risk for getting cancer would
be able to clarify their situation and, if possible, take protective action. Alternatively,
anti-cancer campaigns, companies, hospitals or other institutions could offer a
service to help people clarify their situation. It would for instance be possible to test
persons when they got their first basal cell carcinoma, which is often recurrent and also
is a moderate predictor for other cancers. If the persons were in a high-risk group,
one could then advise them about, or they could of their own accord choose,
risk-reducing behaviour, such as avoidance of excessive sun exposure, abstaining from
smoking etc. About 5 percent of the Danish population will at some point in their life
get a basal cell carcinoma.

(2) Anti-cancer campaigns, companies, hospitals or other institutions would be able
to define relevant target subpopulations and focus information on risk-reducing
behaviour on these persons. They might perhaps also be in a position to inform the
remainder of the population that they need not worry. Lung cancer affects
approximately 10-15 percent of smokers and thus approximately 5 percent of the
population, somewhat varying from country to country. Malignant melanoma, a
sun-induced, often lethal form of skin cancer, affects approximately 700 persons a year
in Denmark or about 1 percent of the Danish population.



(3) The drugs used in cancer treatment are often carcinogenic themselves and
individual responses to them vary considerably, both with respect to tolerance to the
treatment and with respect to efficacy of the treatment. It is an obvious possibility
that the region of chromosome 19 here dealt with, which contains DNA repair genes
known to modulate carcinogen responses, also modulates response to anti-cancer
agents. Hence, analysis of the region may facilitate better choices of treatment for
cancer, and/or help predict the future course of disease.

By sequence polymorphism is understood any single nucleotide, tandem repeat,
insert, deletion or block polymorphism which varies among humans, whether it is of
biological importance or not.

Position of sequence polymorphism in the region r 



In one embodiment of the methods of the invention, preferably the method for
diagnosis as described herein, one or more single nucleotide polymorphism(s) at a
predetermined position in the region r (SEQ ID NO: 1) are identified and used for e.g.
cancer risk profiling and/or cancer treatment response profiling. Presently preferred
single nucleotide polymorphism(s) are listed in Table 1. However, the present
invention relates to any SNP in the r region.



Table 1

Identification in dbSNP¹    Alleles    Position in SEQ ID NO: 1

rs#3138378                  A/G        137
rs#3138376                  G/T        235
rs#209725                   C/A        ambiguous location
rs#2377328                  C/T        7199
rs#6966                     A/T        7887 (=RAIe6)
rs#2017154                  A/C        12115
rs#2017104                  A/G        12190
rs#2070830                  T/G        14575
rs#1970764                  A/G        15798 (=RAIi1)
rs#2226949                  T/G        32035
rs#959457                   C/T        32446
rs#2336218                  C/A        32447
rs#766934                   A/G        32481
rs#928911                   C/T        32785
rs#1005165                  C/T        33974
rs#1005166                  C/T        34119
rs#967591                   A/G        34858 (=ASE-1e1)
rs#1046282                  T/C        35596
rs#2013521                  A/T        38254
rs#735482                   A/C        36926
rs#762562                   A/G        37287
rs#2336919                             ambiguous location
rs#743571                   C/Q        37788

¹ dbSNP is the database of SNPs established by the National Center for
Biotechnology Information and located on the internet at http://www.ncbi.nlm.nih.gov/SNP/.

In another embodiment of the invention, preferably the method for diagnosis described
herein is one in which the tandem repeat is at a position as described in Table 2:

Table 2

Identification in UniSTS²

D19S908
STS-W67936
D19S543
D19S393
STS-R48186
GDB:181915
RH47033
GDB:190019

² UniSTS is a database of unique sequence tag sites established by the National Center
for Biotechnology Information and located on the internet at
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unists

In another embodiment of the invention, preferably the method for diagnosis
described herein is one in which the sequence polymorphism is in region r. Testing for
the presence of the RAI gene allele is especially preferred because, without wishing
to be bound by theoretical considerations, of its association with increased risk of
cancer (as explained herein).

The sequence polymorphism of the invention comprises at least one base
difference, such as at least two base differences. As described above, the sequence
polymorphism comprises at least one single nucleotide polymorphism, such as at
least two single nucleotide polymorphisms. Also, the sequence polymorphism
comprises at least one tandem repeat polymorphism, such as at least two tandem
repeat polymorphisms.

Also, the sequence polymorphism may be a combination of single nucleotide
polymorphisms and tandem repeats.

The status of the individual may be determined by reference to allelic variation at
one, two, three, four or more of the above loci.

Cell sample

The cell sample used in the present invention may be any suitable cell sample
capable of providing the genetic material for use in the method. In a preferred
embodiment the cell sample is a blood sample, a tissue sample, a sample of secretion,
semen, ovum, a washing of a body surface (e.g. a buccal swab), a clipping of a
body surface (hairs, or nails), such as wherein the cell is selected from white blood
cells and tumor tissue.

It will be appreciated that the test sample may equally be a nucleic acid sequence
corresponding to the sequence in the test sample, that is to say that all or a part of
the region in the sample nucleic acid may firstly be amplified using any convenient
technique, e.g. PCR, before use in the analysis of variation in the region.



Detection methods

Detection may be conducted on the sequence of SEQ ID NO: 1 or a complementary
sequence as well as on transcriptional products (mRNA) and translational products
(polypeptides, proteins) therefrom.

It will be apparent to the person skilled in the art that there are a large number of
analytical procedures which may be used to detect the presence or absence of
variant nucleotides at one or more of the positions mentioned herein in the r region.
Mutations or polymorphisms within or flanking the r region can be detected by utilizing a
number of techniques. Nucleic acid from any nucleated cell can be used as the
starting point for such assay techniques, and may be isolated according to standard
nucleic acid preparation procedures that are well known to those of skill in the art. In
general, the detection of allelic variation requires a mutation discrimination
technique, optionally an amplification reaction and a signal generation system. Table 3
lists a number of mutation detection techniques, some based on the PCR. These
may be used in combination with a number of signal generation systems, a selection
of which is listed in Table 4. Further amplification techniques are listed in Table 5.
Many current methods for the detection of allelic variation are reviewed by Nollau et
al., Clin. Chem. 43, 1114-1120, 1997; and in standard textbooks, for example
"Laboratory Protocols for Mutation Detection", Ed. by U. Landegren, Oxford University
Press, 1996 and "PCR", 2nd Edition by Newton & Graham, BIOS Scientific
Publishers Limited, 1997.

Table 3

Abbreviations:

ALEX™    Amplification refractory mutation system linear extension
APEX     Arrayed primer extension
ARMS™    Amplification refractory mutation system
b-DNA    Branched DNA
CMC      Chemical mismatch cleavage
bp       base pair
COPS     Competitive oligonucleotide priming system
DGGE     Denaturing gradient gel electrophoresis
FRET     Fluorescence resonance energy transfer
LCR      Ligase chain reaction
MASDA    Multiple allele specific diagnostic assay
NASBA    Nucleic acid sequence based amplification
OLA      Oligonucleotide ligation assay
PCR      Polymerase chain reaction
PTT      Protein truncation test
RFLP     Restriction fragment length polymorphism
SDA      Strand displacement amplification
SNP      Single nucleotide polymorphism
SSCP     Single-strand conformation polymorphism analysis
SSR      Self sustained replication
TGGE     Temperature gradient gel electrophoresis



Table 4 illustrates various mutation detection techniques capable of being used for
SNP detection.

Table 4

General techniques: DNA sequencing, Sequencing by hybridisation, SNaPshot

Scanning techniques: PTT*, SSCP, DGGE, TGGE, Cleavase, Heteroduplex
analysis, CMC, Enzymatic mismatch cleavage

Hybridisation based techniques

Solid phase hybridisation: Dot blots, MASDA, Reverse dot blots, Oligonucleotide
arrays (DNA Chips)

Solution phase hybridisation: Taqman™ - U.S. Pat. Nos. 5,210,015 & 5,487,972
(Hoffmann-La Roche), Molecular Beacons - Tyagi et al. (1996), Nature
Biotechnology, 14, 303; WO 95/13399 (Public Health Inst., New York), Lightcycler, optionally in
combination with FRET.

Extension based: ARMS™, ALEX™ - European Patent No. EP 332435 B1
(Zeneca Limited), COPS - Gibbs et al. (1989), Nucleic Acids Research, 17, 2347.

Incorporation based: Mini-sequencing, APEX

Restriction enzyme based: RFLP, Restriction site generating PCR

Ligation based: OLA

Other: Invader assay
Various signal generation or detection systems are listed below:

Fluorescence: FRET, Fluorescence quenching, Fluorescence polarisation - United
Kingdom Patent No. 2228998 (Zeneca Limited)

Other: Chemiluminescence, Electrochemiluminescence, Raman, Radioactivity,
Colorimetric, Hybridisation protection assay, Mass spectrometry



Table 5 illustrates examples of further amplification techniques.

Table 5

SSR, NASBA, LCR, SDA, b-DNA

Preferred mutation detection techniques include ARMS™, ALEX™, COPS,
Taqman, Molecular Beacons, RFLP, and restriction site based PCR and FRET
techniques.

Particularly preferred methods include FRET, Taqman, ARMS™ and RFLP based
methods.

In a preferred embodiment, mutations or polymorphisms can be detected by using a
microassay of nucleic acid sequences immobilized to a substrate or "gene chip"
(see, e.g., Cronin et al., 1996, Human Mutation 7:244-255).

Further, improved methods for analyzing DNA polymorphisms, which can be utilized
for the identification of region r specific mutations, have been described that
capitalize on the presence of variable numbers of short, tandemly repeated DNA
sequences between the restriction enzyme sites. For example, Weber (U.S. Pat. No.
5,075,217) describes a DNA marker based on length polymorphisms in blocks of
(dC-dA)n-(dG-dT)n short tandem repeats. The average separation of (dC-dA)n-(dG-dT)n
blocks is estimated to be 30,000-60,000 bp. Markers that are so closely
spaced exhibit a high frequency co-inheritance, and are extremely useful in the
identification of genetic mutations, such as, for example, mutations within the RAI
gene, and the diagnosis of diseases and disorders related to RAI mutations.
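
To illustrate what a length polymorphism in a (dC-dA)n block means at the sequence level, the following Python sketch counts the number of repeat units in the longest uninterrupted run of a dinucleotide. It is a simplified illustration only: the function name and the example sequences are invented, and in practice such alleles are usually sized from amplified fragments rather than read from sequence in this way.

```python
import re


def longest_repeat_run(sequence, unit="CA"):
    """Return the repeat count n of the longest uninterrupted (unit)n run."""
    pattern = re.compile("(?:%s)+" % re.escape(unit))
    runs = [m.group(0) for m in pattern.finditer(sequence.upper())]
    if not runs:
        return 0
    # Convert the length of the longest run in bases to repeat units.
    return max(len(run) for run in runs) // len(unit)


# Two hypothetical alleles differing only in the number of CA repeats.
allele_1 = "GGT" + "CA" * 5 + "TTG"
allele_2 = "GGT" + "CA" * 7 + "TTG"
print(longest_repeat_run(allele_1), longest_repeat_run(allele_2))
```

Two alleles that differ in this count differ in fragment length by a multiple of the repeat unit, which is what a length-based marker assay detects.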

Also, Caskey et al. (U.S. Pat. No. 5,364,759) describe a DNA profiling assay for
detecting short tri- and tetranucleotide repeat sequences. The process includes
extracting the DNA of interest, such as the RAI gene, amplifying the extracted DNA,
and labelling the repeat sequences to form a genotypic map of the individual's DNA.

The level of RAI gene expression can also be assayed. For example, RNA from a
cell type or tissue known, or suspected, to express the RAI gene, such as brain,
may be isolated and tested utilizing hybridization or PCR techniques such as are
described above. The isolated cells can be derived from cell culture or from a
patient. The analysis of cells taken from culture may be a necessary step in the
assessment of cells to be used as part of a cell-based gene therapy technique or,
alternatively, to test the effect of compounds on the expression of the RAI gene. Such
analyses may reveal both quantitative and qualitative aspects of the expression
pattern of the RAI gene, including activation or inactivation of RAI gene expression.
In one embodiment of such a detection scheme, a cDNA molecule is synthesized
from an RNA molecule of interest (e.g., by reverse transcription of the RNA
molecule into cDNA). A sequence within the cDNA is then used as the template for a
nucleic acid amplification reaction, such as a PCR amplification reaction, or the like.
The nucleic acid reagents used as synthesis initiation reagents (e.g., primers) in the
reverse transcription and nucleic acid amplification steps of this method are chosen
from among the RAI gene nucleic acid reagents described above. The preferred
lengths of such nucleic acid reagents are at least 9-30 nucleotides. For detection of
the amplified product, the nucleic acid amplification may be performed using
radioactively or non-radioactively labeled nucleotides. Alternatively, enough amplified
product may be made such that the product may be visualized by standard ethidium
bromide staining or by utilizing any other suitable nucleic acid staining method.



Additionally, it is possible to perform such RAI gene expression assays "in
situ", i.e., directly upon tissue sections (fixed and/or frozen) of patient
tissue obtained from biopsies or resections, such that no nucleic acid
purification is necessary. Nucleic acid reagents such as those described above
may be used as probes and/or primers for such in situ procedures (see, for
example, Nuovo, G. J., 1992, "PCR In Situ Hybridization: Protocols And
Applications", Raven Press, NY).



Alternatively, if a sufficient quantity of the appropriate cells can be obtained,
standard Northern analysis can be performed to determine the level of mRNA
expression of the RAI gene.



Activity of the gene 

Another method for detecting sequence polymorphism is by analysing the activity
of gene products resulting from the sequences. Accordingly, in one embodiment the
detection uses the activity of the RAI gene product as compared to a reference.
In particular, if the activity of the gene product is decreased or increased by
at least or about 50%, such as at least or about 40%, for example at least or
about 30%, such as at least or about 20%, for example at least or about 10%,
such as at least or about 5%, for example at least or about 2%, this indicates a
sequence polymorphism in the gene.



Mutations outside the region 



The present invention may combine the results for sequence polymorphisms within
the region r with those for sequence polymorphisms outside the region in order to
increase the probability of the correlation.



Primers 


The primer nucleotide sequences of the invention further include: (a) any
nucleotide sequence that hybridizes to a nucleic acid molecule of the region r or
its complementary sequence or RNA products under stringent conditions, e.g.,
hybridization to filter-bound DNA in 6×SSC (sodium chloride/sodium citrate) at
about 45°C followed by one or more washes in 0.2×SSC/0.1% SDS at about 50-65°C,
or (b) under highly stringent conditions, e.g., hybridization to filter-bound
nucleic acid in 6×SSC at about 45°C followed by one or more washes in
0.1×SSC/0.2% SDS at about 68°C, or under other hybridization conditions which are
apparent to those of skill in the art (see, for example, Ausubel F. M. et al.,
eds., 1989, Current Protocols in Molecular Biology, Vol. I, Green Publishing
Associates, Inc., and John Wiley & Sons, Inc., New York, at pp. 6.3.1-6.3.6 and
2.10.3). Preferably the nucleic acid molecule that hybridizes to the nucleotide
sequence of (a) and (b), above, is one that comprises the complement of a nucleic
acid molecule of the region r or a complementary sequence or RNA product thereof.
In a preferred embodiment, nucleic acid molecules comprising the nucleotide
sequences of (a) and (b) comprise a nucleic acid molecule of RAI or a
complementary sequence or RNA product thereof.

Among the nucleic acid molecules of the invention are deoxyoligonucleotides
("oligos") which hybridize under highly stringent or stringent conditions to the
nucleic acid molecules described above. In general, for probes between 14 and 70
nucleotides in length the melting temperature (Tm) is calculated using the
formula:

Tm (°C) = 81.5 + 16.6 (log [monovalent cations (molar)]) + 0.41 (% G+C) - (500/N)

where N is the length of the probe. If the hybridization is carried out in a
solution containing formamide, the melting temperature is calculated using the
equation:

Tm (°C) = 81.5 + 16.6 (log [monovalent cations (molar)]) + 0.41 (% G+C)
          - (0.61 % formamide) - (500/N)

where N is the length of the probe. In general, hybridization is carried out at
about 20-25 degrees below Tm (for DNA-DNA hybrids) or 10-15 degrees below Tm
(for RNA-DNA hybrids).
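The melting temperature formulas above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the logarithm is assumed to be base 10, which the text does not state explicitly.

```python
import math


def gc_percent(sequence):
    """Return the percentage of G and C bases in a probe sequence."""
    seq = sequence.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def melting_temperature(gc, length, monovalent_molar, formamide_percent=0.0):
    """Estimate Tm in degrees C from the formula quoted in the text.

    Assumption: 'log' is taken as log10. The formula is stated for probes
    of 14 to 70 nucleotides, so other lengths are rejected.
    """
    if not 14 <= length <= 70:
        raise ValueError("formula is stated for probes of 14-70 nucleotides")
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * gc
            - 0.61 * formamide_percent
            - 500.0 / length)


def hybridization_window(tm, hybrid="DNA-DNA"):
    """Return the (low, high) hybridization temperature range below Tm."""
    offsets = {"DNA-DNA": (25.0, 20.0), "RNA-DNA": (15.0, 10.0)}
    far, near = offsets[hybrid]
    return (tm - far, tm - near)


probe = "CTGAGATCGCACCACTGCAC"  # SEQ ID NO:10, used here only as an example input
tm = melting_temperature(gc_percent(probe), len(probe), monovalent_molar=1.0)
```

With a 1 M monovalent cation concentration the logarithmic term vanishes, so the estimate depends only on G+C content and probe length. The salt and formamide values passed in are illustrative choices, not conditions taken from the text.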

Exemplary highly stringent conditions may refer, e.g., to washing in
6×SSC/0.05% sodium pyrophosphate at 37°C (for about 14-base oligos), 48°C (for
about 17-base oligos), 55°C (for about 20-base oligos), and 60°C (for about
23-base oligos).

Accordingly, the invention further provides nucleotide primers or probes which
detect the r region polymorphisms of the invention. The assessment may be
conducted by means of at least one nucleic acid primer or probe, such as a primer
or probe of DNA, RNA or a nucleic acid analogue such as peptide nucleic acid
(PNA) or locked nucleic acid (LNA). The nucleotide primer or probe is preferably
capable of hybridising to a subsequence of the region corresponding to SEQ ID
NO: 1, or a part thereof, or a region complementary to SEQ ID NO: 1.

According to one aspect of the present invention there is provided an
allele-specific oligonucleotide probe capable of detecting an r region
polymorphism at one or more positions in the r region as defined by the
positions in SEQ ID NO: 1.

The allele-specific oligonucleotide probe is preferably 5-50 nucleotides, more
preferably about 5-35 nucleotides, more preferably about 5-30 nucleotides, more
preferably at least 9 nucleotides.

The design of such probes will be apparent to the molecular biologist of ordinary
skill. Such probes are of any convenient length such as up to 50 bases, up to 40
bases, more conveniently up to 30 bases in length, such as for example 8-25 or
8-15 bases in length. In general such probes will comprise base sequences
entirely complementary to the corresponding wild type or variant locus in the
region. However, if required one or more mismatches may be introduced, provided
that the discriminatory power of the oligonucleotide probe is not unduly
affected. The probes of the invention may carry one or more labels to facilitate
detection.

In one embodiment, the primers and/or probes are capable of hybridizing to a
subsequence selected from the group of subsequences below, wherein the
polymorphism is denoted as, for example, (T/C):

1. GCTCTGAAAC TTACTAGCCC (A/G) GTATTTATGG AGAGGCATTT
2. GTGGTCAAAT TCTCATTCAT CGTGG (T/C) CCAGGCAAGC ACACTTCCTC
3. ACCCTGAGGT GAGCACCTGT TCCTT (C/T) TCCTTGCCCT TAGCCCAGAG GTAGA
4. GGGCAGGGGT TTGTGCCTCC AATGA (G/A) CACAAGCTCC CCCTGCCCCC CAACT
5. CCTGGCGGTG GCCGTCACCA GCTTT (T/C) GGGGGTGTTT GGGAAGCTGG
6. CTCCAGCCCC ACTGTTCCCT (A/G) GGCCCTATTG GTCCCCCTGG
7. ACAAGGAGGA GGCAGAAGTG AGGTT (G/C) AAACCCACTG CCCAATCTTA
8. CCAACACGGT GAAACCCCGT CTGTA (T/C) TAAAAATACA AAAATTAGCC
9. AATCCAGGAC CCCATAATCTTCCGT (C/T) ATCTAAAACA ATAATGGTGA
10. CCCAAGGGGG CGAGGGGAGG GTGAA (A/G) GGGTGGGACG GGGGCAGCCG
11. GAAGTGAGAA GGGGGCTGGG GGTCG (G/-) CGCTCGCTAG CGGGCGCGGG
12. CGCACGCGCA GTATCCCGAT TGGCT (C/G) TGCCCTAGCG GATTGACGGG
13. AACTCCTGGG TTCGATCAAT ACTCA (GACA/-) ATCTTGGCAG GCGCAGGAGG
14. GCTGGGATTA CAGGCTTGAG CCACC (A/G) CGCCCGGCCT GCAAAGCCAT
15. TTTTGTATCT TTAGTAGAGA GAGG (T/G) TTTCTCCATG TTGGTCAGGC
16. GCCTCAGCCT CCCGAGTAGC TGAGACT (C/A) CAGGTGCCCG CCACCACGCC
17. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
18. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
19. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
20. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
21. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
22. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
23. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
24. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
25. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
26. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
27. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
28. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
29. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
30. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
31. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
32. ACAGGAGAGG GAAGGTTTTTTG (A/T) TTTTTTTTTT GTTTTTTTTT
33. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
34. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT
35. TTGAGACTCT CTGTTTGAT (A/G) CTTCACTCAG AAGGTGCTTC
36. AGGCCAGGCT CCTGCTGGCT G (C/G) GCTGGTGCAG TCTCTGGGGA
37. CCCCTATACC CTCAAGCAT (C/T) TATCCATTGA GTTACAAACA
38. ACCATCGCCC GCCTTCCGTT (A/C) GTCCGGCCCC CGAGGCTAGC


In another embodiment, the primers and/or probes are capable of hybridizing to a
subsequence selected from the group of subsequences below:

1. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
2. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
3. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
4. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
5. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
6. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
7. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
8. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
9. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
10. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
11. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
12. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
13. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
14. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
15. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
16. ACAGGAGAGG GAAGGTTTTTTG (A/T) TTTTTTTTTT GTTTTTTTTT
17. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
18. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT

In yet another embodiment, the primers and/or probes are capable of hybridizing
to a subsequence selected from the group of subsequences below:

1. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
2. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
3. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
4. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
5. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

It is preferred in one embodiment that at least one sequence polymorphism is
assessed in a region corresponding to SEQ ID NO: 1 position 1521-37752 (r),
including at least one sequence polymorphism assessed in a region corresponding
to SEQ ID NO: 1 position 7760-22885.

In another embodiment, at least one sequence polymorphism is assessed in a region
corresponding to SEQ ID NO: 1 position 34391-37683, ending with the coding region
of ASE-1 (cagcctgtgtag), where tag is the stop codon.
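The position ranges given above can be checked programmatically. The following Python sketch assumes that positions in SEQ ID NO: 1 are 1-based and inclusive, which the text does not state; the dictionary labels and function names are hypothetical.

```python
# Position ranges quoted in the text, assumed 1-based and inclusive.
REGIONS = {
    "region r": (1521, 37752),
    "sub-region 7760-22885": (7760, 22885),
    "sub-region 34391-37683": (34391, 37683),
}


def regions_containing(position):
    """Return the names of all listed regions that contain a position."""
    return [name for name, (start, end) in REGIONS.items()
            if start <= position <= end]


def extract(sequence, start, end):
    """Return the subsequence from start to end, 1-based and inclusive."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions fall outside the sequence")
    return sequence[start - 1:end]
```

A polymorphism at a hypothetical position 8000, for instance, would be reported as lying in both region r and the 7760-22885 sub-region.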

In a preferred embodiment the primers or probes are selected from one or more of
the following:

TGGCTAACACGGTGAAACC (SEQ ID NO:7)
GGAATCCAAAGATTCTATGATGG (SEQ ID NO:8)
GGGAGGCGGAGCTTGCAGTGA (SEQ ID NO:9)
CTGAGATCGCACCACTGCAC (SEQ ID NO:10)
GGTTTTCTGCTCTGCACACG (SEQ ID NO:11)
CCTTTCTCCTTCCACCAACG (SEQ ID NO:12)
CGGGCTACAGGGTTACCTGAG (SEQ ID NO:13)
TCTGCAACCTGGTGCGAGCAGC (SEQ ID NO:14)
CCTACCACCATCATCACATCC (SEQ ID NO:15)
GCCTTGCCAAAAATCATAACC (SEQ ID NO:16)
CCTCTCCCCAATTAAGTGCCTTCACACAGC (SEQ ID NO:17)
AGCCAGGGAGOTTGAGGCT (SEQ ID NO:18)
AGACAGCCCTGAATCAGCAC (SEQ ID NO:19)
GCAATGAGCCGAGATAGAA (SEQ ID NO:20)
TX5GCTAGCCCATTACTCTA (SEQ ID NO:21)



According to another aspect of the present invention there is provided a
diagnostic nucleic acid primer capable of detecting an r region polymorphism at
one or more positions in the r region as defined by the positions in SEQ ID
NO: 1.

The primer or probe may be a diagnostic nucleic acid primer, defined as an
allele-specific primer, used generally together with a constant primer in an
amplification reaction such as a PCR reaction, which provides the discrimination
between alleles through selective amplification of one allele at a particular
sequence position. The diagnostic primer is preferably 5-50 nucleotides, more
preferably about 5-35 nucleotides, more preferably about 5-30 nucleotides, more
preferably at least 9 nucleotides.

In accordance with the present invention diagnostic primers are provided,
comprising the sequences set out below as well as derivatives thereof wherein
about 6-8 of the nucleotides at the 3' terminus are identical to the sequences
given below and wherein up to 10, such as up to 8, 6, 4, 2, or 1 of the remaining
nucleotides may be varied without significantly affecting the properties of the
diagnostic primer. Conveniently, the sequence of the diagnostic primer is as
written below.
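The derivative rule described above (a conserved 3' terminus with a limited number of variations elsewhere) can be expressed as a simple check. The Python sketch below is illustrative only: the function name and default values are hypothetical, and it assumes the derivative has the same length as the reference primer, so insertions and deletions are not considered.

```python
def is_acceptable_derivative(reference, candidate,
                             conserved_3prime=8, max_varied=10):
    """Check a candidate primer against the derivative rule in the text.

    The last `conserved_3prime` bases (the 3' terminus) must be identical,
    and at most `max_varied` of the remaining bases may differ.
    Assumption: both primers have equal length (substitutions only).
    """
    reference = reference.upper()
    candidate = candidate.upper()
    if len(reference) != len(candidate):
        raise ValueError("this sketch only compares primers of equal length")
    if not 0 < conserved_3prime <= len(reference):
        raise ValueError("conserved_3prime must fit within the primer")
    if reference[-conserved_3prime:] != candidate[-conserved_3prime:]:
        return False
    varied = sum(
        ref_base != cand_base
        for ref_base, cand_base in zip(reference[:-conserved_3prime],
                                       candidate[:-conserved_3prime])
    )
    return varied <= max_varied


reference_primer = "CTGAGATCGCACCACTGCAC"  # SEQ ID NO:10, as an example input
```

Whether a given substitution actually leaves the properties of the primer unaffected has to be established experimentally; the check only encodes the counting rule.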

Furthermore, as described above, at least two sets of primer(s) and/or probe(s)
may be combined in the method, thereby increasing the correlation probability.
This second or other set of primer(s) and/or probe(s) may be nucleotides or
nucleotide analogues hybridising to a region within the region r or to a sequence
different from the region r. Said sequence different from the region r is
preferably a region in chromosome 19, preferably in chromosome 19q. In particular
such second or other primer or probe may be selected from one or more of the
sequences below, or the complementary strands:

GCCCCGTCCCAGGTA (SEQ ID NO:21)
AGCCCCAAGACCCTTTCACT (SEQ ID NO:22)
GTCCCATAGATAGGAGTGAAAG (SEQ ID NO:23)
CCCTAGGACACAGGAGCACA (SEQ ID NO:24)
TTGTGCTTTCTCTGTGTCCA (SEQ ID NO:25)
TATCAGAAAAGGCTGGAGGA (SEQ ID NO:26)
GAGTGGCTGGGGAGTAGGA (SEQ ID NO:27)
GCCAAGCAGAAGAGACAAA (SEQ ID NO:28)
CCTCAGATGTCCTCTGCTCA (SEQ ID NO:29)
GCCACAGCCCCAGCAAGTAG (SEQ ID NO:30)
AGGACCACAGGAGACGCAGA (SEQ ID NO:31)
CATAGAACAGTCCAGAACAC (SEQ ID NO:32)
TTAGCTTGGCACGGCTGTCCAAGGA (SEQ ID NO:33)
ACAGAATTCGCCCCGGCCTGGTACAC (SEQ ID NO:34)
TTGAAACTGGAACTCTGAGAAGG (SEQ ID NO:35)
TGGTGGATGGTGTGAAGCA (SEQ ID NO:36)
CCTTTCTCCAACTTCTTCTCCATTTCCACC (SEQ ID NO:37)
GGGGATCATGTCGTCAATGGACT (SEQ ID NO:38)
ATGCCCTGTAGGTTCAATGG (SEQ ID NO:39)
TGGAGGTCTTTAGGGGCTTG (SEQ ID NO:40)
GGCTGGTCCCCGTCTTCTCCTTCC (SEQ ID NO:41)
TCTCTGTTGCCACTTCAGCCTC (SEQ ID NO:42)
GTCCTGCCCTCAGCAAAGAGAA (SEQ ID NO:43)
TTCTCCTGCGATTAAAGGCTGT (SEQ ID NO:44)
ATCCTGTCCCTACTGGCCATTC (SEQ ID NO:45)
TGTGGACGTGACAGTGAGAAAT (SEQ ID NO:46)
TGGAGTGCTATGGCACGATCTCT (SEQ ID NO:47)
CCATGGGCATCAAATTCCTGGGA (SEQ ID NO:48)
CACACCTGGCTCATTTTTGTAT (SEQ ID NO:49)
TCATCCAGGTTGTAGATGCCA (SEQ ID NO:50)
AGGCTCAACAAGGAAAAATGC (SEQ ID NO:51)
GCTAGACAGTCAAGGAGGGACG (SEQ ID NO:52)
AAAGGGTGGGTGTGGGAGACATTGG (SEQ ID NO:53)
AAACCAACCTAGGCACCCCAAA (SEQ ID NO:54)
CAGTGTCCAAAGAGCACC (SEQ ID NO:55)
CTACCCCTTTAGCGAGC (SEQ ID NO:56)
TCCTGCCCCCAGAGCGTCACC (SEQ ID NO:57)
GTACGGTCCACATAATTTTGGAGGA (SEQ ID NO:58)
CGACGAACTTCTCTGAAGCGAA (SEQ ID NO:59)
AGCGACACGGGCATCTGG (SEQ ID NO:60)
ATGAGCGTCCACCTCCTGAACC (SEQ ID NO:61)
AGGCAGCAGCATCGTCATCCCC (SEQ ID NO:62)
TGCATAGCTAGGTCCTGC (SEQ ID NO:63)
AACTGACRAAACTAGCTCTATGGGGTGGTGCCGCA (SEQ ID NO:64)
CTGGCTCTGAAACTTACTAGCCC (SEQ ID NO:65)
GCTGGACTGTGACCGCATG (SEQ ID NO:66)
GGAGCAGGGTTGGCGTG (SEQ ID NO:67)
TGCCCTCCCAGAGGTAAGGCCT (SEQ ID NO:68)
CCCTCCCGGAGGTAAGGCCTC (SEQ ID NO:69)
GATCAAAGAGAGAGACGAGC (SEQ ID NO:70)
GAAGCCCAGGAAATGC (SEQ ID NO:71)
GGACGCCCACCTGGCCAACC (SEQ ID NO:72)
CGTGCTGCCCAACGAAGTG (SEQ ID NO:73)
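Since the primers or probes may also be taken from the complementary strands, the following Python sketch shows one way to derive such a strand. It assumes that the complementary strand is written 5' to 3', i.e. as the reverse complement, and it handles IUPAC ambiguity codes on the assumption that a letter such as R in the listed sequences denotes a purine; the function name is hypothetical.

```python
# Complement table covering the four bases and the IUPAC ambiguity codes.
_COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(sequence):
    """Return the complementary strand written 5' to 3'."""
    seq = "".join(sequence.upper().split())
    unknown = set(seq) - set("ACGTRYKMSWBDHVN")
    if unknown:
        raise ValueError("unexpected symbols: " + "".join(sorted(unknown)))
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when transcribing primer lists.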



The primers and probes may be manufactured using any convenient method of
synthesis. Examples of such methods may be found in standard textbooks, for
example "Protocols for Oligonucleotides and Analogues: Synthesis and Properties",
Methods in Molecular Biology Series; Volume 20; Ed. Sudhir Agrawal, Humana;
ISBN: 0-89603-247-7; 1993; 1st Edition. If required the primer(s) and probe(s)
may be labelled to facilitate detection.

Kit

According to another aspect of the present invention there is provided a
diagnostic kit comprising at least one diagnostic primer of the invention and/or
at least one allele-specific oligonucleotide primer of the invention.

The diagnostic kits may comprise appropriate packaging and instructions for use
in the methods of the invention. Such kits may further comprise appropriate
buffer(s) and polymerase(s) such as thermostable polymerases, for example Taq
polymerase.

Preferred kits can comprise means for amplifying the relevant sequence such as
primers, polymerase, deoxynucleotides, buffer, metal ions; and/or means for
discriminating the polymorphism, such as one or a set of probes hybridising to
the polymorphic site, a sequence reaction covering the polymorphic site, an
enzyme or an antibody; and/or a secondary amplification system, such as
enzyme-conjugated antibodies, or fluorescent antibodies. The kit-of-parts
preferably also comprises a detection system, such as a fluorometer, a film, an
enzyme reagent or another highly sensitive detection device.

The methods described herein may be performed, for example, by utilizing
pre-packaged diagnostic kits. The invention therefore also encompasses kits for
detecting the presence of a polypeptide or nucleic acid of the invention in a
biological sample (i.e., a test sample). Such kits can be used, e.g., to
determine if a subject is suffering from or is at increased risk of developing a
disorder associated with a disorder-causing allele, or aberrant expression or
activity of a polypeptide of the invention. For example, the kit can comprise a
labeled compound or agent capable of detecting the polypeptide or mRNA or DNA or
RAI gene sequences, e.g., encoding the polypeptide in a biological sample. The
kit can further comprise a means for determining the amount of the polypeptide or
mRNA in the sample (e.g., an antibody which binds the polypeptide or an
oligonucleotide probe which binds to DNA or mRNA encoding the polypeptide). Kits
can also include instructions for observing that the tested subject is suffering
from or is at risk of developing a disorder associated with aberrant expression
of the polypeptide if the amount of the polypeptide or mRNA encoding the
polypeptide is above or below a normal level, or if the DNA correlates with
presence of a RAI allele that causes a disorder.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody
(e.g., attached to a solid support) which binds to a polypeptide of the
invention; and, optionally, (2) a second, different antibody which binds to
either the polypeptide or to the first antibody and is conjugated to a detectable
agent.

Identification of an allele as having implication for risk of cancer

An allele in the r region can be identified as correlated with an increased risk
of developing cancer on the basis of statistical analyses of the incidence of a
particular allele in two groups of individuals with and without cancer,
respectively, according to the χ² test, which is well known in the art.
Furthermore, an allele in the region can be identified as an allele correlated
with prognosis of cancer on the basis of statistical analyses of the incidence of
a particular allele in individuals demonstrating different prognostic
characteristics.
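A comparison of allele incidence between individuals with and without cancer can be sketched as a Pearson χ² test on a 2×2 table. The Python code below is a minimal illustration without continuity correction; the function name is hypothetical and the counts are invented for the example, not data from this document.

```python
import math


def chi_square_2x2(case_carriers, case_noncarriers,
                   control_carriers, control_noncarriers):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    total = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = total * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom the upper tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 of 100 cases and 15 of 100 controls carry the allele.
chi2, p_value = chi_square_2x2(30, 70, 15, 85)
```

For small expected counts an exact test may be more appropriate than the χ² approximation; that choice is outside the scope of this sketch.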

Identification of humans having increased likelihood of responding to treatment

It is further contemplated that the present invention provides a method for
identifying a human subject as having an increased likelihood of responding
positively to a cancer treatment, comprising determining the presence in the
subject of an r region allele genotype correlated with an increased likelihood of
positive response to treatment, whereby the presence of the genotype identifies
the subject as having an increased likelihood of responding to cancer treatment.

The treatment mentioned herein may be any cancer treatment, such as conventional
cancer treatment, for example X-ray, chemotherapeutics, surgical excision or
combinations thereof.






Protein Products of the Gene(s) 



Gene products of the region r, or peptide fragments thereof, can be prepared for
a variety of uses. For example, such gene products, or peptide fragments thereof,
can be used for the generation of antibodies and in diagnostic assays.

The gene products of the invention include, but are not limited to, human RAI
gene products and ASE-1 gene products. In the following the invention is
described in relation to the RAI gene product.

Gene product, sometimes referred to herein as a "protein" or "polypeptide,"
includes those gene products encoded by the RAI gene sequences shown as position
7821-21350 in SEQ ID NO: 1. Among gene product variants are gene products
comprising amino acid residues encoded by the polymorphisms. Such gene product
variants also include a variant of the RAI gene product.

In addition, RAI gene products may include proteins that represent functionally
equivalent gene products. In preferred embodiments, such functionally equivalent
RAI gene products are naturally occurring gene products. Functionally equivalent
RAI gene products also include gene products that retain at least one of the
biological activities of the RAI gene products described above, and/or which are
recognized by and bind to antibodies (polyclonal or monoclonal) directed against
RAI gene products.

Antibodies to Gene Products

Described herein are methods for the production of antibodies capable of
specifically recognizing one or more gene product epitopes or epitopes of
conserved variants or peptide fragments of the gene products. Further, antibodies
that specifically recognize mutant forms are encompassed by the invention. The
terms "specifically bind" and "specifically recognize" refer to antibodies that
bind to RAI gene product epitopes at a higher affinity than they bind to non-RAI
(e.g., random) epitopes.

Such antibodies may include, but are not limited to, polyclonal antibodies,
monoclonal antibodies (mAbs), humanized or chimeric antibodies, single chain
antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a Fab
expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding
fragments of any of the above, including the polyclonal and monoclonal antibodies
described below. Such antibodies may be used, for example, in the detection of a
gene product in a biological sample and may, therefore, be utilized as part of a
diagnostic or prognostic technique whereby patients may be tested for abnormal
levels of gene products, and/or for the presence of abnormal forms of such gene
products. Such antibodies may also be utilized in conjunction with, for example,
compound screening schemes, as described below, for the evaluation of the effect
of test compounds on gene product levels and/or activity.

For the production of antibodies against a gene product, various host animals may
be immunized by injection with a RAI gene product, or a portion thereof. Such
host animals may include, but are not limited to, rabbits, mice, and rats, to
name but a few. Various adjuvants may be used to increase the immunological
response, depending on the host species, including but not limited to Freund's
(complete and incomplete), mineral gels such as aluminum hydroxide, surface
active substances such as lysolecithin, pluronic polyols, polyanions, peptides,
oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful
human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

Polyclonal antibodies are heterogeneous populations of antibody molecules derived
from the sera of animals immunized with an antigen, such as a gene product, or an
antigenic functional derivative thereof. For the production of polyclonal
antibodies, host animals such as those described above may be immunized by
injection with gene product supplemented with adjuvants as also described above.

Monoclonal antibodies, which are homogeneous populations of antibodies to a
particular antigen, may be obtained by any technique that provides for the
production of antibody molecules by continuous cell lines in culture. These
include, but are not limited to, the hybridoma technique of Kohler and Milstein
(1975, Nature 256:495-497; and U.S. Pat. No. 4,376,110), the human B-cell
hybridoma technique (Kosbor et al., 1983, Immunology Today 4:72; Cole et al.,
1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030), and the EBV-hybridoma
technique (Cole et al., 1985, Monoclonal Antibodies And Cancer Therapy, Alan R.
Liss, Inc., pp. 77-96). Such antibodies may be of any immunoglobulin class
including IgG, IgM, IgE, IgA, IgD and any subclass thereof. The hybridoma
producing the mAb of this invention may be cultivated in vitro or in vivo.
Production of high titers of mAbs in vivo makes this the presently preferred
method of production.

In addition, techniques developed for the production of "chimeric antibodies"
(Morrison, et al., 1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger, et al.,
1984, Nature 312:604-608; Takeda, et al., 1985, Nature 314:452-454) by splicing
the genes from a mouse antibody molecule of appropriate antigen specificity
together with genes from a human antibody molecule of appropriate biological
activity can be used. A chimeric antibody is a molecule in which different
portions are derived from different animal species, such as those having a
variable region derived from a murine mAb and a human immunoglobulin constant
region. (See, e.g., Cabilly et al., U.S. Pat. No. 4,816,567; and Boss et al.,
U.S. Pat. No. 4,816,397, which are incorporated herein by reference in their
entirety.)

In addition, techniques have been developed for the production of humanized
antibodies. (See, e.g., Queen, U.S. Pat. No. 5,585,089, which is incorporated
herein by reference in its entirety.) An immunoglobulin light or heavy chain
variable region consists of a "framework" region interrupted by three
hypervariable regions, referred to as complementarity determining regions (CDRs).
The extent of the framework region and CDRs have been precisely defined (see
"Sequences of Proteins of Immunological Interest", Kabat, E. et al., U.S.
Department of Health and Human Services (1983)). Briefly, humanized antibodies
are antibody molecules from non-human species having one or more CDRs from the
non-human species and a framework region from a human immunoglobulin molecule.

Alternatively, techniques described for the production of single chain antibodies
(U.S. Pat. No. 4,946,778; Bird, 1988, Science 242:423-426; Huston, et al., 1988,
Proc. Natl. Acad. Sci. U.S.A. 85:5879-5883; and Ward, et al., 1989, Nature
334:544-546) can be adapted to produce single chain antibodies against gene
products. Single chain antibodies are formed by linking the heavy and light chain
fragments of the Fv region via an amino acid bridge, resulting in a single chain
polypeptide.

Antibody fragments that recognize specific epitopes may be generated by known
techniques. For example, such fragments include but are not limited to: the
F(ab')2 fragments, which can be produced by pepsin digestion of the antibody
molecule, and the Fab fragments, which can be generated by reducing the disulfide
bridges of the F(ab')2 fragments. Alternatively, Fab expression libraries may be
constructed (Huse, et al., 1989, Science 246:1275-1281) to allow rapid and easy
identification of monoclonal Fab fragments with the desired specificity.

Immunoassays for gene products, conserved variants, or peptide fragments thereof
will typically comprise incubating a sample, such as a biological fluid, a tissue
extract, freshly harvested cells, or lysates of cells, in the presence of a
detectably labeled antibody capable of identifying gene product, conserved
variants or peptide fragments thereof, and detecting the bound antibody by any of
a number of techniques well-known in the art.

The biological sample may be brought in contact with and immobilized onto a solid
phase support or carrier, such as nitrocellulose, that is capable of immobilizing
cells, cell particles or soluble proteins. The support may then be washed with
suitable buffers followed by treatment with the detectably labeled gene product
specific antibody. The solid phase support may then be washed with the buffer a
second time to remove unbound antibody. The amount of bound label on the solid
support may then be detected by conventional means.

By "solid phase support or carrier" is intended any support capable of binding an
antigen or an antibody. Well-known supports or carriers include glass,
polystyrene, polypropylene, polyethylene, dextran, nylon, amylases, natural and
modified celluloses, polyacrylamides, gabbros, and magnetite. The nature of the
carrier can be either soluble to some extent or insoluble for the purposes of the
present invention. The support material may have virtually any possible
structural configuration so long as the coupled molecule is capable of binding to
an antigen or antibody. Thus, the support configuration may be spherical, as in a
bead, or cylindrical, as in the inside surface of a test tube, or the external
surface of a rod. Alternatively, the surface may be flat such as a sheet, test
strip, etc. Preferred supports include polystyrene beads. Those skilled in the
art will know many other suitable carriers for binding antibody or antigen, or
will be able to ascertain the same by use of routine experimentation.

One of the ways in which the RAI gene product-specific antibody can be detectably
labeled is by linking the same to an enzyme, such as malate dehydrogenase,
staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase,
alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish
peroxidase, alkaline phosphatase, asparaginase, glucose oxidase,
beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate
dehydrogenase, glucoamylase and acetylcholinesterase. The detection can be
accomplished by colorimetric methods that employ a chromogenic substrate for the
enzyme. Detection may also be accomplished by visual comparison of the extent of
enzymatic reaction of a substrate in comparison with similarly prepared
standards.

Detection may also be accomplished using any of a variety of other immunoassays,
for example by radioactively labeling the antibodies or antibody fragments, or by
labeling the antibody with a fluorescent compound. Among the most commonly used
fluorescent labeling compounds are fluorescein isothiocyanate, rhodamine,
phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

The antibody can also be detectably labeled using fluorescence emitting metals
such as 152Eu, or others of the lanthanide series, or by coupling it to a
chemiluminescent compound.

Diseases 

Described herein are various applications of gene sequences, gene products,
including peptide fragments and fusion proteins thereof, and of antibodies directed
against gene products and peptide fragments thereof. Such applications include, for
example, prognostic and diagnostic evaluation of cancer and the identification of
subjects with a predisposition to such disorders, as described above.

The method according to the invention may be used in relation to any cancer form,
such as but not limited to skin carcinoma including malignant melanoma, breast
cancer, lung cancer, colon cancer and other cancers in the gastrointestinal tract,
prostate cancer, lymphoma, leukemia, pancreas cancer, head and neck cancer,
ovary cancer and other gynecological cancers. In particular the method is relevant
for skin cancer, lung cancer, colon cancer and breast cancer, such as skin cancer
and breast cancer, preferably wherein the skin cancer is basal cell carcinoma.

Gene nucleic acid sequences, described above, can be utilized for transferring
recombinant nucleic acid sequences to cells and expressing said sequences in
recipient cells. Such techniques can be used, for example, in marking cells or for the
treatment of cancer. Such treatment can be in the form of gene replacement therapy.
Specifically, one or more copies of a normal RAI gene, or a portion of the RAI
gene that directs the production of a RAI gene product exhibiting normal RAI gene
function, may be inserted into the appropriate cells within a patient, using vectors
that include, but are not limited to, adenovirus, adeno-associated virus, and
retrovirus vectors, in addition to other particles that introduce DNA into cells, such as
liposomes.

Examples 

The examples relate to the prediction of cancer from sequence polymorphisms in the
region r. Blood was collected before (example 6) or after (examples 1 through 5) the
persons acquired cancer. However, the sampling time is considered immaterial, as
DNA in a polyclonal blood sample is not expected to change over time.

The particular sequence polymorphisms analysed in these examples are listed in
Table 6, together with their sources of information and their definition as sequences.

Table 6. The markers used, their sources of information, and their currently
estimated positions on chromosome 19, as well as their position in Figure 2.

Marker        Source of        Position in   Number of    Position    Position in
              identification   sequence      sequence     (Mbases)    Figure 2

XRCC1 e10     Ref. 1           28152         L34079       59.420      1
CKM e8        rs#8188          20076         AC005781     61.361      2
XPD e23       Ref. 1           35931         L47234       61.479      3
XPD e10       Ref. 1           23591         L47234       61.491      4
XPD e6        Ref. 1           22541         L47234       62.4923     5
XPD i4        rs#1618538       19244         L47234       61.4924     6
RAI e6        rs#6966          8786          L47234       61.506      7
RAI i1        rs#1970764       875           L47234       61.514      8
ASE-1 e1      rs#967591        232125        NT_011242    61.534      9
ERCC1 e4      Ref. 1           19007         MQ3796       61.547      10
FOSB e4       rs#104S698       34621         M88851       61.601      11
SLC1A5 e8     rs#1060043       60620         AC008622     62.946      12
GLTSCR1 e1    rs#1035938       20775         AC010519     63.986      13
LIG1 e6       rs#20580         111           L27710       65.460      14


rs numbers were derived from NCBI's database dbSNP.
Ref. 1: Shen, M.R., Jones, I.M., and Mohrenweiser, H. (1998) Nonconservative
amino acid substitution variants exist at polymorphic frequency in DNA repair genes
in healthy humans. Cancer Res., 58: 604-8, 1998.

Study groups. The groups of Caucasian Americans with and without BCC have
been described previously (Athas et al, Cancer Res. 51:5786-5793, 1991;
Wei et al, Proc. Natl. Acad. Sci. USA, 90: 1614-8, 1994). Briefly, the study was
a clinic-based case-control study at the Johns Hopkins Hospital, which serves
multiple participating dermatologists in Maryland. Cases were histopathologically
confirmed primary BCCs and were diagnosed between 1987 and 1990. The controls
were patients from the same physician practices and had a diagnosis of mild skin
disorders. All participants were Caucasians living near Baltimore and were between
20 and 60 years of age. The controls were frequency matched to the cases by age and
sex. Cases and controls with any other forms of cancer were excluded. In the
questionnaire, the study subjects were asked if they had any blood relatives with skin
cancer, and were asked to specify the type of cancer. Study subjects with relatives
with basal cell carcinoma, squamous cell carcinoma or 'skin cancer' were included in
the group of subjects with a family history of skin cancer. Subjects with relatives with
melanoma were not included. At the clinic visit the subjects gave informed consent,
were examined by dermatologists, completed a structured questionnaire and provided
blood. DNAs from available frozen lymphocytes were purified using Puregene
(Gentra Systems) and were genotyped. Initially, 71 cases and 118 controls were
included in this study. However, the number of persons varied between analyses,
as the supply of DNAs was gradually depleted. In the case of the SNP RAI i1
only 133 persons could be genotyped reliably.

The groups of 20 psoriatic Danes with and 20 psoriatic Danes without BCC have
been described previously (Dybdahl et al, Cancer Epidemiol. Biomarkers Prev.,
8:77-81, 1999). Briefly, BCC subjects were identified from a population-based cohort
of persons treated by Danish dermatologists in the year 1995, and fulfilled the
following criteria: (a) age in 1995 < 50 years; and (b) clinically verified diagnosis of
psoriasis. The diagnosis of BCC was clinically and histologically confirmed. The
controls, consisting of psoriasis cases without BCC, were selected from among patients
treated in the years 1992-1995 for psoriasis by dermatologists who participated in the
national cohort study of 1995. The controls were matched by age and sex. The patients
with psoriasis and BCC differed from the national cohort of BCC in that the average
age at first BCC was 38 years against 56 years in the cohort. A number of cases had had
multiple BCCs. There was a tendency that cases had been treated for a longer time
than the controls, and also that the treatments were more intense. This was to be
expected, as treatment of psoriasis involves a number of carcinogenic treatment
modalities. DNAs from available frozen lymphocytes were purified using Puregene
(Gentra Systems) and were genotyped.

Primers and probes. Table 7 includes the polymorphisms typed on LightCycler™, the
primers used for the PCR reaction and the probes used for detection and typing of
the PCR products. Table 8 lists the polymorphisms typed by conventional PCR-RFLP,
and the primers and restriction enzymes used. Table 9 lists the polymorphisms
typed by SNaPshot technology and the primers used. Table 10 lists the
polymorphisms analyzed on a Taqman, and the primers and probes used. Hobolth
DNA, Hillerød, Denmark or DNA Technology, Aarhus, Denmark, synthesized the
primers in tables 7, 8, and 9. TIB Mol-Biol, Berlin, Germany synthesized the
LightCycler probes. TAG-Copenhagen ApS (Tagc.com, Copenhagen, Denmark)
synthesized the primers, and Applied Biosystems synthesized the fluorescent Taqman
probes in table 10.

Table 7. Design of primers and fluorogenic probes for LightCycler

ASE-1 e1
Forward primer: 5'-GGTTTTCTGCTCTGCACACG
Reverse primer: 5'-CCTTTCTCCTTCCACCAACG
Anchor probe: 5'-TCTGCAACCTGGTGCGAGCAGC-Fluorescein
Sensor probe: 5'-LCRed640-CGGGCTACAGGGTTACGTGAG-p

CKM e8
Forward primer: 5'-TTGAAACTGGAACTCTGAGAAGG
Reverse primer: 5'-TGGTGGATGGTGTGAAGCA
Anchor probe: 5'-LCRed640-CCTTTCTCCAACTTCTTCTCCATTTCCACC-p
Sensor probe: 5'-GGGGATCATGTCGTCAATGGACT-Fluorescein

ERCC1 e4
Forward primer: 5'-AGGACCACAGGACACGCAGA-3'
Reverse primer: 5'-CATAGAACAGTCCAGAACAC-3'
Anchor probe: 5'-LCRed640-TGGCGACGTAATTCCCGACTATGTGCTG-p-3'
Sensor probe: 5'-CGCAACGTGCCCTGGGAAT-Fluorescein

FOSB e4
Forward primer: 5'-AGGCTCAACAAGGAAAAATGC
Reverse primer: 5'-GCTAGACAGTCAAGGAGGGACG
Anchor probe: 5'-LCRed640-AAAGGGTGGGTGTGGGAGACATTGG-p
Sensor probe: 5'-AAACCAACCTAGGCACCCCAAA-Fluorescein

GLTSCR1 e1
Forward primer: 5'-CGACGAACTTCTCTGAAGCGAA
Reverse primer: 5'-AGCGACACGGGCATCTGG
Anchor probe: 5'-ATGAGCGTCCACCTCCTGAACO-Fluorescein
Sensor probe: 5'-LCRed640-AGGCAGCAGCATCGTCATCCCC-p

LIG1 e6
Forward primer: 5'-ATGCCCTGTAGGTTCAATGG
Reverse primer: 5'-TGGAGGTCTTTAGGGGCTTG
Anchor probe: 5'-GGCIGGTCCXJCGTCTTCTCCTTCC-Fluorescein
Sensor probe: 5'-LCRed640-TCTCTGTTGCCACTTCAGCCTC-p

RAI i1
Forward primer: 5'-TGGCTAACACGGTGAAACC
Reverse primer: 5'-GGAATCCAAAGATTCTATGATGG
Anchor probe: 5'-GGGAGGCGGAGCTTGCAGTGA-Fluorescein
Sensor probe: 5'-LCRed640-CTGAGATCGCACCACTGCAC-p

SLC1A5 e8
Forward primer: 5'-CAGTGTCCAAAGAGCACC
Reverse primer: 5'-CTACCCCTTTAGCGACC
Anchor probe: 5'-LCRed640-TCCTGCCCCCAGAGCGTCACC-p
Sensor probe: 5'-TACGGTCCACATAATTTTGGAGGA-Fluorescein

XPD e10
Forward primer: 5'-GATCAAAGAGACAGACGAGG
Reverse primer: 5'-GAAGCCCAGGAAATGC
Anchor probe: 5'-GGACGCCCACCTGGCCAACC-Fluorescein
Sensor probe: 5'-LCRed640-CGTGCTGCCCAACGAAGTG-p
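
As a quick sanity check on primers such as those listed above, length, GC content and a rough melting temperature can be computed. The following is only a minimal sketch: the function names are hypothetical, and the Wallace rule, Tm = 2(A+T) + 4(G+C), is a rule of thumb for short oligonucleotides that is assumed here for illustration and is not a method described in this application.

```python
# Rough primer checks; function names are illustrative only.

def gc_fraction(seq):
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C (Wallace rule of thumb)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Forward primer listed for XPD e10 in the LightCycler primer table.
XPD_E10_FORWARD = "GATCAAAGAGACAGACGAGG"

if __name__ == "__main__":
    print(len(XPD_E10_FORWARD), gc_fraction(XPD_E10_FORWARD), wallace_tm(XPD_E10_FORWARD))
```

The estimate is coarse and ignores salt and probe effects; it is useful only for spotting transcription errors in a primer sequence.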



Table 8. Primers and restriction enzymes used for typing of SNPs using PCR-RFLP

Gene, exon             Primers                    Enzyme   Digested fragments

XRCC1 exon 10          TTGTGCTTTCTCTGTGTCCA       MspI     240, 375 bp (A);
                       TATCAGAAAAGGCTGGAGGA                615 bp (G)

ERCC1 exon 4           AGGACCACAGGACACGCAGA       BsrDI    157, 368 bp (A);
                       CATAGAACAGTCCAGAACAC                525 bp (G)

XPD exon 6    1. set   CACACCTGGCTCATTTTTGTAT     TfiI
                       TCATCCAGGTTGTAGATGCCA
              2. set   TGGAGTGCTATGGCACGATCTCT    TfiI     56, 114, 482 bp (A);
                       CCATGGGCATCAAATTCCTGGGA             56, 596 bp (C)

XPD exon 23   1. set   GTCCTGCGCTCAGCAAAGAGAA
                       TTCTCCTGCGATTAAAGGCTGT
              2. set   ATCCTGTCCCTACTGGCCATTC     PstI     66, 100, 158 bp (C);
                       TGTGAACGTGACAGTGAGAAAT              100, 224 bp (A)
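
The fragment patterns in the PCR-RFLP table above can be turned into genotype calls. The sketch below is illustrative only: the dictionary and function names are hypothetical, band sizes are assumed to be read exactly (a real gel would need a size tolerance), and a heterozygote is assumed to show the union of both allele patterns.

```python
# Allele-specific fragment sizes (bp) as listed in the PCR-RFLP table.
RFLP_PATTERNS = {
    "XRCC1 exon 10": {"A": {240, 375}, "G": {615}},
    "ERCC1 exon 4": {"A": {157, 368}, "G": {525}},
    "XPD exon 6": {"A": {56, 114, 482}, "C": {56, 596}},
    "XPD exon 23": {"C": {66, 100, 158}, "A": {100, 224}},
}


def call_genotype(marker, observed_bands):
    """Return a genotype such as 'AA' or 'AG', or None if the bands fit no pattern."""
    observed = set(observed_bands)
    alleles = RFLP_PATTERNS[marker]
    first, second = sorted(alleles)  # two alleles per marker, in alphabetical order
    if observed == alleles[first]:
        return first + first
    if observed == alleles[second]:
        return second + second
    if observed == alleles[first] | alleles[second]:
        return first + second  # heterozygote: both band patterns present
    return None


if __name__ == "__main__":
    print(call_genotype("XRCC1 exon 10", [240, 375, 615]))
```

Returning None for an unexpected band pattern keeps failed digests from being silently scored as a genotype.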



Table 9. Design of primers and SNaPshot primers for SNaPshot typing on
sequenator

XRCC1 exon 7
Forward primer: 5'-GTCCCATAGATAGGAGTGAAAG
Reverse primer: 5'-CCCTAGGAGACAGGAGCACA
SNaPshot primer: 5'-TGCATAGCTAGGTCCTGC

XRCC1 exon 17
Forward primer: 5'-GCCAAGCAGAAGAGACAAA
Reverse primer: 5'-GAGTGGCTGGGGAGTAGGA
SNaPshot primer: 5'-AACTGACRAAACTAGCTCTATGGGGTGGTGCCGCA

RAI exon 6
Forward primer: 5'-CTACCACCATCATCACATCC
Reverse primer: 5'-GCCTTGCCAAAAATCATAACC
SNaPshot primer: 5'-CGTCTCCCCAATTAAGTGCCTTCACACAGC

XPD intron 4
Forward primer: 5'-CGCAAAAACTTGTGTATTCACC
Reverse primer: 5'-CCCATTTTTATCATCAGCAACC
SNaPshot primer: 5'-CTGGCTCTGAAACTTACTAGCCC



Table 10. Design of primers and probes for Taqman.

XRCC1 exon 10
Forward primer: 5'-GCT GGA CTG TCA CCG CAT G
Reverse primer: 5'-GGA GCA GGG TTG GCG TG
Probe (A): 5'-Fam-TGC CCT CCC AGA GGT AAG GCC T-Tamra
Probe (G): 5'-Vic-CCG TCC CGG AGG TAA GGC CTC-Tamra



Determination of polymorphisms by LightCycler. Genotypes of the American persons
for polymorphisms in ASE-1e1, CKMe8, ERCC1e4, FOSBe4, GLTSCR1e1,
LIG1e6, RAIi1, SLC1A5e8 and XPDe10 and of the Danish persons for polymorphisms
in ASE-1e1, CKMe8, FOSBe4, LIG1e6 and SLC1A5e8 were detected using
LightCycler™ (Roche Molecular Biochemicals, Mannheim, Germany). PCR was
performed by rapid cycling in a reaction volume of 20 µl with 0.5 µM of each primer,
0.045 µM of anchor and sensor probe, 3.5 mM MgCl2, approximately 7 - 25 ng
genomic DNA, and 2 µl LightCycler DNA Master Hybridization probe buffer (Roche
Molecular Biochemicals, Cat No 2158 825). This buffer contains Taq DNA polymerase,
dNTP mix, and 10 mM MgCl2. In some cases the reaction mixture also
contained 5 % DMSO. The temperature cycling consisted of denaturation at 95 °C for 2
sec, followed by 46 cycles consisting of 2 sec at 95 °C, 10 sec at 57 °C, and 30 sec
at 72 °C. The last annealing period at 72 °C was extended to 120 sec. The melting
profile was determined by a temperature ramp from 50 °C to 95 °C with a rate of 0.1
degree/sec. For RAIi2 we ran the melting profile 3 times and used the last curve.
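
The concentrations quoted for the LightCycler reaction can be converted into absolute amounts per tube, and the length of the melting ramp follows from its span and rate. The sketch below is illustrative arithmetic only; the helper names are hypothetical, and it assumes the stated concentrations are final concentrations in the 20 µl reaction.

```python
import math

REACTION_VOLUME_UL = 20  # reaction volume stated in the text


def amount_pmol(conc_um, volume_ul):
    """Micromolar times microlitres gives picomoles."""
    return conc_um * volume_ul


def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Duration of a linear temperature ramp."""
    return (end_c - start_c) / rate_c_per_s


primer_pmol = amount_pmol(0.5, REACTION_VOLUME_UL)            # each primer
probe_pmol = amount_pmol(0.045, REACTION_VOLUME_UL)           # anchor or sensor probe
mgcl2_nmol = amount_pmol(3500, REACTION_VOLUME_UL) / 1000     # 3.5 mM expressed as 3500 uM
melt_s = ramp_seconds(50, 95, 0.1)                            # melting profile ramp

if __name__ == "__main__":
    print(primer_pmol, round(probe_pmol, 3), mgcl2_nmol, round(melt_s))
```

Under these assumptions each tube holds about 10 pmol of each primer, 0.9 pmol of each probe and 70 nmol MgCl2, and one melting ramp takes about 450 seconds.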

PCR-RFLP analyses. Genotypes of the American persons for polymorphisms in
XPDe6 and XPDe23 and of Danish psoriatics for polymorphisms in XRCC1e10,
ERCC1e4, XPDe6, and XPDe23 were detected using the PCR-RFLP technique (Shen
et al, see above; Dybdahl et al, see above; Vogel et al, Cancer Epidemiol. Biomarkers
Prev., 8:77-81 (2001)). The reactions were performed as reported in these
references.

Determination of polymorphisms by SNaPshot technique on sequenator. The
polymorphisms in RAIe6, XPDi4, XRCC1e7, and XRCC1e17 in the American persons
were typed simultaneously on an ABI Prism 310 sequenator (Applied Biosystems,
Foster City, CA, USA) using the SNaPshot technique (Lindblad-Toh et al, Nature
Genetics, 24: 381-6, 2000). The PCR reaction consisted of 1 µl purified genomic DNA,
1 pmole of each primer (DNA Technology, Aarhus, Denmark), 1Z5 nmole of each
dNTP (Bioline, London, UK), 100 nmole MgCl2 (Bioline), and 0.15 µl BIOTAQ™ DNA
Polymerase (Bioline) in a total volume of 20 µl water. The program consisted of 4
min at 96 °C, followed by 25 cycles of 96 °C for 30 sec, 60 °C for 30 sec, and 72 °C
for 60 sec. The last cycle was followed by 72 °C for 6 min. The primers and dNTPs
were removed in reactions containing 2 U Shrimp Alkaline Phosphatase (SAP)
(Roche), 2 U Exonuclease I (Biolabs, Denmark), and 9 µl PCR reaction in a total
volume of 14 µl water. The reactions were incubated at 37 °C for 60 min and 72 °C
for 15 min. The SNaPshot reactions contained 1 µl SNaPshot Ready Reaction Mix
(Applied Biosystems), 0.5 µl of each SNaPshot primer (XRCCe7-ss1, 4 pmol/µl;
XPDI5-cp1, 0.5 pmol/µl; RAIe7-cp1, 1 pmol/µl; XRCCe17-ss1, 2 pmol/µl), 2 µl of the
purified PCR product, and 1.5 µl buffer (200 mM Tris-HCl, 5 mM MgCl2, pH 9.0).
The reactions were cycled 25 times: 96 °C for 10 s, 50 °C for 5 s, and 60 °C for 30 s.
The primers and dNTPs were removed in a reaction containing 1 U SAP, 0.8 µl
10x SAP buffer, and 5 µl SNaPshot reaction in a total volume of 8 µl water. Two µl
purified product was added to 10 µl concentrated deionized formamide (Amresco,
Ohio, USA), incubated 5 min at 95 °C, and analyzed on the sequenator. The two
markers in XRCC1, in exon 7 and exon 17, could not be reliably scored and thus
were excluded from further consideration.

Determination of polymorphisms by real-time PCR using Taqman probes. The
polymorphism in XRCC1e10 in the American persons was analysed using the ABI
Prism 7700 sequence detection system (Applied Biosystems, Foster City, CA, USA).
PCR primers and Taqman probes were designed using Primer Express, v 1.0 (Applied
Biosystems). The reactions were performed in MicroAmp optical tubes sealed
with MicroAmp optical caps (Applied Biosystems) containing a 10 µl reaction volume:
1x Taqman buffer A, 2.5 mM MgCl2, 200 µM each of dATP, dCTP, dGTP,
400 µM dUTP, 800 nM each primer, 200 nM each probe, 0.01 U/µl AmpErase UNG,
0.025 U/µl AmpliTaq Gold Polymerase. Thermal cycler conditions were as follows:
tubes were incubated at 50 °C for 2 min followed by 10 min at 95 °C. The incubation
was succeeded by 45 cycles of 95 °C for 15 sec and 64 °C for 1 min.

Example 1.

DNA from humans from the American cohort of patients with basal cell carcinoma
and controls, described in Materials and Methods, was typed with respect to a
number of sequence polymorphisms located in and around the claimed region r. The
resulting statistical p-values for association of occurrence of the individual sequence
polymorphisms with the status of patients are depicted in Figure 2. Also depicted are
the calculated odds ratios for association of sequence polymorphism and disease.
For the calculation of the odds ratios the heterozygote genotypes were combined
with the lesser group of homozygotes, and the ordering of the groups was chosen
such that the odds ratio became more than or equal to 1. The results show that the
sequence polymorphism RAIi1 is strongly associated with disease in this cohort (p =
0.004). Bonferroni correction for the number of tests made indicates that a result
less than 0.007 must be considered significant at a level of 0.05. Thus, even after
correction for multiplicity of testing this result is significant.
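
The Bonferroni correction mentioned above divides the overall significance level by the number of tests. The sketch below illustrates the arithmetic; the function names are hypothetical, and because the text does not state how many tests were counted, the value of seven used here is purely an assumption, chosen because 0.05 / 7 is approximately 0.007.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


ASSUMED_N_TESTS = 7  # assumption: not stated in the text

if __name__ == "__main__":
    print(round(bonferroni_threshold(0.05, ASSUMED_N_TESTS), 3))
    print(is_significant(0.004, 0.05, ASSUMED_N_TESTS))
```

With this assumed number of tests, the reported p-value of 0.004 falls below the corrected threshold.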

The numbers next to the points in the curves are merely a help to identify the single
sequence polymorphisms:

1, XRCC1e10; 2, CKMe8; 3, XPDe23; 4, XPDe10; 5, XPDe6; 6, XPDi4; 7, RAIe6; 8,
RAIi1; 9, ASE-1e3; 10, ERCC1e4; 11, FOSBe4; 12, SLC1A5e8; 13, GLTSCR1e1;
14, LIG1e6.



Example 2 

Those persons in example 1 who got basal cell carcinoma before the age of 50
years were selected, and the results from analysis of RAIi1 were compared with the
status of the patients. There was a strong relationship between the occurrence of
the individual genotypes of the sequence polymorphism and the status of the
patients (Table 7; odds ratio = 12.3; p(χ²) = 0.00014).






Table 7. Occurrences of genotypes for the sequence polymorphism RAIi1 in Americans
with basal cell carcinoma occurring before 50 years of age and in controls.

RAIi1 genotypes    Number of cases before 50 years of age    Number of controls

Example 3

The data of Example 2 were combined with results of genotyping the neighbouring
sequence polymorphism RAIe6. There was a very strong association between the
combined genotypes of RAIi1 and RAIe6 and the status of the patients. Thus,
almost all American cases occurring before the age of 50 years were homozygous for
the A alleles of both RAIi1 and RAIe6, while only approximately half of the controls
were so (Table 8; odds ratio = 12.8; p(χ²) = 0.00006).

Table 8. Combined occurrences of different genotypes for the sequence
polymorphisms RAIi2 and RAIe7 in American cases occurring before 50 years of age
and in controls.

                          RAIi1
             RAIe6     AA     AG     GG

BCC cases    AA        30      0      0
             AT         0      2      0
             TT         0      0      0

Controls     AA        42     10      1
             AT         2     21      0
             TT         1      0      2



Example 4 

The data of Example 2 were combined with results of genotyping the sequence
polymorphism GLTSCR1e1 located outside the claimed region r. There was a very
strong association between the combined genotypes of RAIi1 and GLTSCR1e1 and
the status of the patients. It was obvious to define "risk-genotypes" as having two As
in RAIi1 and at least one C in GLTSCR1e1. This corresponds to the assumptions
that the RAIi1 A allele is recessive and the GLTSCR1e1 C allele is dominant. If one
does so, one finds that 25 out of 25 cases have a "risk-genotype", while only 28 out
of 62 controls have one (Table 9; odds ratio > 30; p(χ²) = 0.000002).

Table 9. Combined occurrences of genotypes for the sequence polymorphisms
RAIi1 and GLTSCR1e1 in American cases of basal cell carcinoma occurring before
50 years of age and in controls.

                            RAIi1
             GLTSCR1e1   AA     AG     GG

BCC cases    CC          17      0      0
             CT           8      0      0
             TT           0      0      0

Controls     CC          15     18      3
             CT          13      7      0
             TT           3      3      0
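
The "risk-genotype" rule described in Example 4 (two As in RAIi1 and at least one C in GLTSCR1e1) can be applied to the counts of Table 9 to recover the summary figures quoted in the text. The sketch below is illustrative; the data layout and function names are hypothetical.

```python
# Counts from Table 9, keyed by GLTSCR1e1 genotype and then RAIi1 genotype.
TABLE_9 = {
    "cases": {
        "CC": {"AA": 17, "AG": 0, "GG": 0},
        "CT": {"AA": 8, "AG": 0, "GG": 0},
        "TT": {"AA": 0, "AG": 0, "GG": 0},
    },
    "controls": {
        "CC": {"AA": 15, "AG": 18, "GG": 3},
        "CT": {"AA": 13, "AG": 7, "GG": 0},
        "TT": {"AA": 3, "AG": 3, "GG": 0},
    },
}


def is_risk_genotype(rai_i1, gltscr1_e1):
    """Two As in RAIi1 and at least one C in GLTSCR1e1."""
    return rai_i1 == "AA" and "C" in gltscr1_e1


def summarise(group):
    """Return (number with a risk genotype, total number) for a group."""
    risk = total = 0
    for gltscr1, row in TABLE_9[group].items():
        for rai, count in row.items():
            total += count
            if is_risk_genotype(rai, gltscr1):
                risk += count
    return risk, total


if __name__ == "__main__":
    print(summarise("cases"), summarise("controls"))
```

Because no case lacks the risk genotype, a plain odds ratio is undefined for these counts, which is consistent with the text reporting it only as a lower bound.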




Example 5 

DNA from humans from the cohort of Danish psoriatics with basal cell carcinoma
and controls, described in Materials and Methods, was typed with respect to a
number of sequence polymorphisms located in and around the claimed region r. The
resulting statistical p-values for association of occurrence of the individual sequence
polymorphisms with the status of patients are depicted in Figure 3. The results show
that the sequence polymorphism ERCC1e4 is strongly associated with disease in
this cohort (p = 0.01).


Example 6. 

Blood samples were collected from a large number of Danish citizens and frozen.
After a number of years those women who got breast cancer in the intervening
period were identified, as well as a set of matching controls. DNAs were purified from
the blood samples of these persons and a number of polymorphisms in the region of
interest, namely RAIi1, ASE-1e3 and ERCC1e4, were typed. The polymorphisms
were subsequently combined such that the high-risk group was homozygous for the
high-risk alleles of all three polymorphisms (RAIi1, ASE-1e3 GG and ERCC1e4 GG).
All other genotypes were combined into the low-risk group (Table 10; OR = 1.59;
p(χ²) = 0.004).

Table 10. Occurrence of a combined "high-risk" genotype of RAIi1, ASE-1e3 (GG)
and ERCC1e4 (GG) as opposed to all other combinations of genotypes for the
sequence polymorphisms RAIi1, ASE-1e3 and ERCC1e4 in Danish cases of breast
cancer and controls.

             High-risk    Low-risk

Cases        120          277
Controls      85          312


The DNAs in these examples were purified from available frozen lymphocytes using
Puregene (Gentra Systems). A variety of other ways of purifying DNA is available to
the expert and would also be expected to lead to the wanted results.

Analysis of sequence polymorphisms can be performed with a variety of techniques,
some of which have been used in the examples of this application. Most often a
number of techniques can produce the wanted result.

Similarly, the choice of primers and probes in a particular assay is to some extent
free, and other primers and probes might well produce similar results.

Finally, it is to be expected that assays for other sequence polymorphisms in the
region of interest may produce roughly similar results. Our particular choice of
sequence polymorphisms and assays used in the examples is thus not intended to
limit our claims. Thus, at present about 30 SNPs within the region r are listed in
NCBI's database dbSNP, including rs#2070830, rs#2017104, rs#2017154 and
rs#2377328, all within or very close to RAI. Other forms of polymorphisms such as
the tandem repeat polymorphisms D19S543 and D19S393 are also known to occur
in the region and can probably serve as markers in the present invention. Moreover,
it is very likely that the region contains a number of as yet undiscovered
polymorphisms. For instance, the sequence of the 5' half of RAI and its upstream
promoter region is currently only a draft version, and new polymorphisms of potential
use for this invention are likely to be uncovered as more sequence reads of this
segment are produced.

Sequence of the r region of chromosome 19 

The following depicts the region r stretching from the beginning of, but not including,
the XPD gene, to approximately the end of ERCC1, and includes the genes RAI,
LOC162978, and ASE-1. More specifically, r is bounded by and includes the following
two sequences: AGAACCCCCG CCCCTCCACC TCGTCTCAAA and
TCCCTCCCCA GAGACTGCAC CAGCGCAGCC, and is defined by SEQ ID NO: 1
herein below:


AGAACCCCCGCCCCTCCACCTCGTCTCAAAAAAAAAAAAAAATCGTCTCAGTAGCGA- ' 
ATAGTCTAACGGAGAATGACAGGGAAATTGGTGATCCTTTCTGGGCCCAAGAGTTA- 
. GAAATGGCTTTGCAGGCCGGGCGCGGTGGCTCAAGCCTGTAATCCCAGCACTTTGG- 
GAGGCTGAGGCAGGTGGATCACCTGAGGTCGGGAGTTCAAGACCAGCCTGACCAA- 

20 CATGGAGAAAACCTGTCTCTACTAAAGATACAAAATTAGCCGGGCGTGCTGGCAAATG- 
CTTGTAATCCCAGCTACTCGGGAGGCTGAAGCAGGAGAATTGCTTGAACCTGGGAGG- 
CAGAGGTTGCAGTGAGCAGAGATGGCGCCGTCGCACTCTAGCCTGGGCAACAAAAG- 
CGAAACTCCATTTCAAATATTAATAATAATAACTAATAAATAAAACATAAATGCTAGCTT- 
TTGTTTGTTTCTTCAACAAATAGCTATGTGGCATCTACCATGTGTCTGATCCTGTGCT- 

25 G GCCCCTG GGAACAGAAAGGTGACCATGACAGCCTCAGCACCTGCCCTCAAAGAACA- 
GAI 1 I I 1 1 IGCTTGAGACAGGGTCTTTCTCTGTCGCCAAGGCTGGAGTGCAGTGGCA- 
CAGTCACAGCTCACTGCAGCCTCCACCTCTTGGGCTCAAGCGATCCTCCCACCTCAG- 
. CTTCCAGAGTAGCTGGGACCAC^GGTGTGCACCACCAAGCCCAGCTAAGTTTTATTTT- 
TTAAATTTTTTTAGAGACGAGGTCTCACCACGTTGCCCAGGCTGGTTAAACTCGCAG- 

30 GTTCAAGTGATCCTCTCCCCTCAGGCTTTCAAATTGTTGGGATTACAGGGGTGAGG- 

CACCAGGCCTGGCCTCAAAGAACAGATATTAAATATACAAATGAATATATGATTACAGC^ 
CTGGAGTGGTGGCTCGTGCCTGTGGTTCCAACACTTTGGAAGGCCAAGGCGAGTA- 
CATTGCTTGAGCTCAGGAGCTAGAGACCAGCCTGGGCAACATGGTGAAAACCCGTC- 
TCTACAAAAAATGCAAAAATTAGCTGGGCGTGGTGGCGTGCACCTGTAGTCCCAGA- 

35 TACTCAGGAGGCTGAGGTGGGAGAATCACCTGGGCCTGGGAGGCAGAGGTTGCAAT- 
GGGCAGTGATTGTGCCACTGCACTCCAGCCTGGGCAACAGGAGTGAAAACCTATCT- 
GAAATGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGCGGACGTGTATAATCACAAGTA- 
CAAAAGTGCTGTGAAGGAAAACTTCAAGTCACCATAAAGATTGATTATGGGCTGGGTG- 
CAGTGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCAGATGGAT- 

40 CACGAGGTCAGGAGTTCAAGACCAGCCTGGTCAACATGGTGAAACCCTATCTCTAC- 

TAAAAAAAAAAAAAAAAAAAAAAAAGCCAGGCATAGTGGCATGCATCTGTAATCCCATC- 
TACTCGGGAGGCTAAAGCAGGAGAATTGCTTGAACCCAGGAGGCAGAAGTGAGCCAA- 
GATCACGCCACTGCACTCCAGCCTGCGTGACAGAGCAAGACTCCGTCCCAGAAAAA. • 
GAAAAAAAAAAAAGACTTATTATGACAGGATGTCTACTGTCAACTGTGGGGTGTGAGT- 

45 GTTGGCCAAGTGATCAGAGAAGGCTTCGTGGAAGAAGCGAGGTTTGAGTAGAGCCA- 
GAAAATAATTAGAAGAGATCAACCAGCAAGAGGGGATGGATGAGAGAAGTGAGAAAG- 
GTGTTCCAGGGAGAGAGACCATCATACACAAAAGCTCTAGGCCAGAAGAAAGCT- 
GAGGCCTGTGAGTGCTGAAAGGAAGCCTGTGGGGGTGGAGCTCTGAGTTGAGCA-
CAGGGAGCAGAGAAAGGGCAGCTGGA GGQQM GGCAGGGGCAGATCGAAATCTCTT-
TTTTAAATTAATTAATTCTTAATTTATTTATTTTT^ 

CAGACTGGAGTACAGTQGCACAATCTCAGCGCACCGCAACCTCTGCCACCCAGGCT- 
CAAGCAATTCTCTGGCCTCAGCCTCCCTAGTAGCTGGGATTACAGGTGCGCACCAC- * 
5 TACTGCCCAGCTAA 1 1 1 I I ATACTTTTAGTAGAAACGGGGTTTCACT ATGTTGGCCAGG- 
CTGGCCTQAAACTCCTGACCTCAAAAGATCC^CC CACTT CAGCCTCCCAAAGTGCTG* 
GGATTACAGGTGTGAGCC ACCCTTCCCGGCTGTA I I 1 I 1 GGAGACAGAGTCTTGCTCT-. 
GTOCCAGCCTGGAGTATGGTGGTGTGAATTTGGCTCATTGCCACCTTGACCTCCAG- 
GGCTCAAGTGATCCTCCCACCTCAGCCTCCTGAGTAGCTGGGACTGCGGGTACACGA- 

10 CACCACGCCTQGTTA A i rTTTTTTAATTTl ITG TAGAGACGAGGGTATCTCACTATGTT- 
GTCCAGGCTGGTTGAACTCCTGAGCTCAAGCAATTCTCCCACCTCAGCCTCC- 
CAAAGTGGTGGGATTACAGACGTGAGCCACTGTGCCCGGCTTAATTTATTTACATAA- 
Al 1 1 1 1 1 I ATGTTTACTTTTCTATCTCCTACAGGAAGAAAATATATTTTGTTATTGACAG- 
GGTCTCGCTATGTTGCCCAGGCTGGTATTGGGCTCAAGCGATCCTGTTCCCTCAGCC- 

1 5 TCCCAAAGTACTGGGATTACAAGCGTGAGCCTCTGCATCCAGCCCAGATCCAAAATCT* 
TTACTGTCACCTACAGAGTCCTCTGTAACTAGCTTACTGCTCATCATCCGCATACCAAC- 
CCACCTTACTGCTCTGATCTCCTCCTCTCTGTCCCCCAGCTCATTTTGTTTCAGCTATG- 
CTGGTTCXrCCTTGCTGTCTCTAAAACATAACAAGCACATCCCATCTCAGGGCCTTTG- 
CACCAGCTATTITGTCTGOCTGGAATGCTGTTTCCCCTGATAGCCATGTGGCTGACA- 

20 CACTCAGCTCCCTCAGCTCTTTGCTCAATTGTCAACTTCTCGGCCCGGCATGGTGGCT- 

cacacctgtaatcctaccactttgggaggctgaggtgggcagatcacctgagatcag- 
gagttcgagaccagcctggccaagatggtgaaatcccgtctctactaaaaatacaaaa. 
attggcaaagcatggtagcacataccagtaatcctagctacccgggaggctgagg- 
caggagaattgctggaacccgggaggcagaggctgcagtgagccaagatcatgc- - 
25 cactgtactccagcctgggtgacaaagcaagactctgtctcaaaaaaaaaaaagtctc- 
cttctcaatgagggcttcctgaccaccaaattaaatctacctcctagacacacacaga- 
cacgcacgcacgcacgcacacacacacacgcacgcacgcacacacacacacacaca- 

CACACTATATCCCCTTTCCCTGCTTTATTGTTCn^ 

CATGCTGAATATTTTACTTATTTATTTTGTTTAGAAAGCTCCTGGCTGGGCGC^ 

30 TCACGCCTGTAATCCCAGCACTTTGGGAGGCTGGAACAGGTGGATCATGTGAGGT- 
CAGGAGTTCGAGACCAGCCTGACCAACACGGTGAAACCTCATCTCTATTAAAAATG- 
CAAAAATTAGCTGGGTGTX3GTGTCGCATGCCTGTAATCCCAACTACTCAGAAGGCT- 
GAAGCAGGAGAATCGCTTGAACCTGGGAGGCAGAGGTTAACGCTGAGCCGAGATCG- 
CGCCA7TGCACTCCAGCCTGGGCAACAAGAGTGAAACTCTGTCTCGAAAAAAA- 

35 CAAAAGTCAGCTCCATGGCAGGAGTGATGGCTCACGCCTATAATCCCAGCACTTTGT- 
GAGGCCGAGGCGGGCGGATCACTTGAGGTCAGGAGTTGGAGACCAGCCTGGCCAA* 
CATGGTGAAACCTCATCTCTAGTAAAAATACAAAAATTAGCCGGGCGTGGTGACACAT- 
GTCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCTGGAGAATGGCTTGAACCTGG- 
GAGGTAGAGGTTGCAGTAAGCCAAGATCGCGCCATTGCTCTCCATCCTGGGCAACA- 

40 G ACTCC GTCTCAG AAAGGAAGAAAG AAGG AAAG AGAGAAAG AG AGAAAGAGAC AG A- 
G AGAG AGAGAGAAAGG GAGAAAGAGAG AAAGG ATGG AAG G ACCCTG ACAAG C ACT- 
GTTGCATAAAAGTTTCTTTTCTCTCTG \ \ I > I N I I I J 1 I i \ ' I ITIT I 1 GAG ACAGGGTO 
TCACTTCTGTTGCTCCAGCTGAAGTGCAGTGGTGAGAACATGGCTCAGTGCAGCCT- 
CAACTTCCCAGGCTTAAGTGATCCTGCCACCTCAGCCTCCTGAGTAGCTGGGACTG- 

45 TAGGTGTGC ACCACCGTGCCT AGCT A A I I NI IGT A1 1 1 1 IA GTAGAGACATGGTTCCG- 
CCACGTTGCCCAGGCTGGTCTTGAACTCCTGGGCTTAAGGGATCTGCCCGCCATGGC- 
CTCCCAAAGTGCTGGGATTACCAGCGTGAGCCACTGTACCCAGCCTGAGTATAGGTT- 
TCTGATAAATTTTAGGATCATATTGTTTGGACTGGGTAAQAATTTCCAGAA 
GAAGAAACTGACTGGTTTATATrrTATTTTATTTTAT^ I I I I GAGATGGATTT- 

50 TCACTCTTGTTGCCCAAGCTGGATTGCAGTGGCACGATCTTGGCTCACCACAACCTC- 
CGCCTCCCGGTTTCAAGTGATTCTCCTGCCTCAGCCTCCGCAGGAGCTGGGATTAr 
CAGGCACCCACC ACCATGCTCGGCT A I II Ml i 1 II I A TT rTTTTATTTI IA GTAGA- 
GACGGGGTTTCACCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATC- 
CACCTGCCTTGGCCTCCCAAAGCGCTGGGATTACAGGCATGAGCCACTGTGCAAGGC- 

55 CTAGGCTGGTTTATAAAATTGCTAAACCAAGCAGAACATGAATTAAATACCAAGGAA- 
ATACTCTCCTAGATTGTCATGTTACATCAGCCAATACTAAAATTGTCAAGATACACAAT- 
TTGAATGAACTCCATGGTCCAAGTCGAATTATCTATGAtATTACCCATCTAATAAACAG- 
CACTATGTCCCTTAATGGGAGAAAAAGTTGGAGAATTTAAGAGAATATCAATCCAAT-
6TTGGTTGGGTGCAGTGAATCATGTCTATATTCCCAGCACTTTQGGAGGCCAAGG-
CAGGAGGATCAC7TGAGCCCAGGAATTCAAGGCCAGCCTCGGCAACACGGTGAGATC- 
CTGTCTCTACGGAAAATTAAAAAAAAAAAAAGAGAGAGATTAGTGGGATGTGGTGCC- 
TATAGTCCCAGCTACHTGGGAGGCTGAGGCGGGAGGATCATTTAAGCCTGGGACGTT- 

CACTAGAGGAAAAAAAAACTAAAGTGGGGTTTGCGGGTAGTGGGAGGGCCCTTCCTG* 

CTAGGTTGCACTATGATCTCCAGGGAGGCTCCACGGGAGAATCATTreCTTGTCTTTT- 

TCAGTTTCTAGAGCCAAATTCTTTGCATACCTTGCATTCCTTGGCTCGGAACCCCTTCC- 

CTAACCTTCAAAGCTG3CAGCTAGGCTCTGGCTCAAGTGTCACATGGCCTGTCTCT- 

GTCTTCCTATCCAATCTTCCTCTTATAAGAACATTGGAGCCAGGCATGGTGGCTGACG- 

CCTGTAATCCCAGCACTTTGGGAGACCGAG6CAGGCGGATCACAAGGTCAGGAGT- 

TCGAGACCAGCCTGGCCAACACAGTGAAACCCCGTCTCTACTAAAAAAATA- 

CAAAAAAGTAGCCGGGCATGGTGGCAGGTGCCTGTAATCCCAGCTACTTGAGAGGCT- 

GAGGCAGGAGAATCGCTTGAACCTGGGAGGCAGAGCTTGCAGTGAGCCGAGATAGT- 

GCCAATGCAGTCCGGCCTGGGCGAAACAGCGAGACTCCGTCGCAAAAAAAAAAAA- 

ATAATAATAA ATAA TAAATAAAAATAAAAATAAAATAAAAAAATAAAAATAATAAAATAA- 

ATAAAAATTATTTTGAGACAAAGTCTATTCTGTGGCAGAGGCTGGAATGCAGTGGCGT- 

GATCACAGCTTACTGCAGCrrTCTACCTCCTGAGCTCAAGCGATCCTTCCACCTTGGCT- 

TCCTGAGTAGCTGGGACGTCAGGTGTACATTACCACGCTCAGCTAATTATTTATTTATT- " 

TATTAT A l I 1 1 i GTGACGGAGTTTCGCTCTTGTTGCCCGGGCTGGAGTGCAATGGTGC- 

TATCTCAGCTCACTGCAACCTCTGCCTCCTGGATTCCAGTGATTCTCCTGTCTCAGCT- 

TCCTGAGTAGCTGGGATTACAGQTACATGCCATCACGCGCAGCTA Ai I i I iGTATTTT- 

TAGTAGAGACGGGGTTTCATCATATTOGTCAGGCTGGTCTCGAACTCCTGACCTCAG- 

GTGATCCACCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGGGACCACG- 

CCCGGCAA Tf m 1 I \ ITC l Ml 1 1 H i l l I C AGAGAGAGTCTTGCTCTGTCACCCAGGC- 

TGGAGTGCAGTAGCGTGATCTCGGTTTACTGCAACCTCCATGTCCCGGGTTCAAG- 

CGATTCTC CTTTCT CA GCCTC CCAAGTAGCTGGGACTACAGGTGCACACCACCACGG> 

CGGGCTAAI I 1 1 1 GTATTTTTAGTAGAC ACCAGGTTTCACCATATTGGTCAGACTGGTG- 

TCAAACTCCTGACCTCAGGTGATCCATCTGCCrrCAGCCTCCCAAATTGCTGGGATrA. 

CAAGCGTGAGCCACACACCTGGCTTA A 1 1 1 1 1 1 I A TTTTTGATCGACACAGGGTCTCCC- • 

TATGTTGTCCAAGCTGGCAGAGAI 1 1 1 1 GTTrGTTTGTTTGAGAGGGAATTTTGCTCTT- 

GTAGCCCAGGCTGGAGTACAATGGTGCAATCTTGGCTGACCACAACTTCCGCCTCC- 

TCGGTTTAACAGATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGAACTACAGGCACC- 

TACCACCACACCAGGCTAATTTTTGTGCTTTTTAGTAGAGATGAGGTTTCACCATGTT- 

GGCCAGGCTGGTCTTAAACTCCTGGCCTCCAGTGATCCACCCGCCTTGACCTCC- 

CAAAGTGCTGAAATTAOAGGCGTGAGCACCGCGCCTGGCCTCTCAACCTACAATTT- 

GAACACCCAAGGAAACAGCCCACCATGAGTGAGAACCAGCAGACACAACAAACTA- 

TAGGATTAGCTGCCTCCAAACTTCAGGTGATAGATTATCAGGCATGTACTTGAAAC- 

TAAAGGACACAAAAGAAGAATCCGAAATATAAAATAAAGGATTGGACTTGTGTGAAAAr 

GAATCCCTTAGAAAGGGCTACTTTCAGGCTGGCCATGGTGGCTAATGGCCTGTAATC- 

COAGCACTTTGGAAGGCCGAGGTGTGTGGATCACCTGAGGTCAAGAGTTCAAGAC- 

CAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTGAAAATAGAAAAATTAGCCAGGT- 

GGGGTGGCAGATGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATCGC- 

TTGAACTCAGGAGGCAGAGGTTGCAGTGAGCTGAGATTGCGCTATCGTGCCCCAGCC- 

TGGG CACTAGAGTG AGATC AAAAAAAAAAAAAAAAAAAGAAG AAGAAGAAGAAAGGGC- 

TACTTTCAGACTGCCTTGCCAAAAATCATAACCACAATGATGAGCATGTATTGAGT- 

CAAAACAGAATCAAAAGAGAAGAAAGTCAATTTCTGTGCAAACTACTTTTATTTATAAG- 

GAAAGTTTCTCTATTTTGTTTATAAACATTAAACCAGTGCTGTGTGAAGGCACT^ 

GGGGAGAGGTGGGGCAGGGATCCTGGTAGAGAGCAATGTTrCCCACCCAGACCC. 

CAAGACTGCTGGGAGAGATGGTGTCAGCAGTGACTCCCAGGAATATCCAGTGGTGTG- 

GTGGCCCATCCCAGGCCCGGCTGGGCAGGTGGCTGGCTTGCTGGGGGATGTGAT- 

GATGGTGGTAGGCATGGGAGGCACTTTGGACGGGATCTGATTTGGCAAAAGGAAGTG-- 

GTTTCCTGTCCCCAGTGATTTCCAGCCCTTCCGAGACCTCCCAAGGCTAAGGCAGAT- 

TACTAAATTTAAGGCTGGGGCCCTCCTTCTTCCCTGGACTTCCAGGAGAACAGAGAAC- 

CGGTGGCAAGGACCACCACCAGCAGGGTGAGGGGTGCAGATAAAGGCAGCAAAAAAr 

CAGAGGGAGAGGTCTGGAGGGAAGGCAGGAATGCTTGTTTCTGTCAGCCTCAGAAAC- 

CTCCTTCTATCCTGCTAGACT7TACTCCT7TGAGGCTTCACCCTGGGGAACAGCTGGG- 

GAGAGACAGGATCTTCAGACATCAGGAGCTCCCACCTCCTCATCCCACATGCAAATG- 
cx3ct6cctgtctctatcctcccaccccttcctaaggggacctctcagcacctcc- 
caaactgctccagaatccaagttctgtgtcacctccaagaaccagatggaaccttcca- 
atcagagcctccactqatqaaatggaatattt ccagt gtctcctaactgocataagqa- 
gaagccc acctctctct aacaccttggttgtctttttgggtcccacctccatatt- 

5 taaaaaatctcctctctcagggccgggagcagtgggtcacacctataatcccagcagt- 
ttgggaggccgaggtgggtggatgacctgagctcaggagttcaagacaagcctggt- 
caacatgacgagaccctgtctctactaaaaacacaaaaaattagctgggcgtggtggt- 
gcatgcccgtaatcccagctacttgggaggctgaggcaggagaatcacttgaatccg- 
ggaggtggaggctgcagtgagcgaagatcgcgccactgcactccagcctggg- 

1 0 cgacgcagctgaagctgtgtctccaaaaacaaaacagacacacacacacacaca* 

gaaaaaaaaaaccaaaataaaaaaatctcccttctcaggaatgtaacggaatcttcctt- 
gccttctcccctaaccctaatagagaattttcctcagttacactgtaattttattaatg- 
gatttttcctcattctgcccaatgcagtgtaatgaaagcttcctctccatctgttata 
tatatataaatatatattatatatttatatatt^ 

15 tattgtc^cccaggctggagtgcagtggcacgatcagggctcactggaggatcaatc- 
tccgaggcttaagcgattctcctgtgtcagcctcctg atgag ctgggattacaqq- 
cacccgccaccacacccggctaaci I 1 1 1 I 1 1 1 1 igtatttttagtagagatggagttt- 
caccatgttggcgaggctggtctagaactcctgacctxjaggagatccgcccgcctt- 
ggcctcccaaagtgctgggattacaggtgtgagccacctggccgggocctccact" 

20 tccttcttgtacattoctgaatccctgtgtcagccctagaggtccagtcnrtttgccctc^ 
tcccagccttaatctacaattctgtaacccacccaccatcattaaaatgagattcttct- 
ttgtcgcttcccttggctaaaatggattattctttaacctctccaccaatacaaccagg- 
gatgataataaaaagattggattgagcagaaaccaatcaaataactagtaaggcagtac- 
. . tggcgagcaccctacatcctgacagctttataaagggcgcttccagccaggtgcggt- - 

25 ggcacatgcctgtaatcccaggactttgggaggctgaggcgggcaggtcacctgag- • 
gtcaggagttcaagaccagcctggccaacgtgatgaaaccctgtctagacaaaata- 
caaaaaaaaaaaaaaaattagccgtgcgtggtggcatgcgcctgtcatcccagctac- * * 
tctggaggccaaggagggaggatcacttgagcccgggaggcagaggttgcagtgag- 
cccacatcttatcactgcactccagtctgggtgacaaagcaagactccatctcaa- 

30 ataaataaatagaaattggccgggtgcggtggctcatgcctgtaatcccagcactttg. 
ggagaccaaggcaggtggatcatttgaggtcagtagatcaaaaccagcctggccaa- • 
catggtgaaaccccgtctctactaaaaatacaaaaagtagccgggcgtggtggtggt- - 
gggcgcctgtaatcccaggcaggagaactggttgagcocgggtggggggggcc- 
cgaggttgcagtgagcacagatggcgccattgcactccagcctgggcgacagag- 

35 cgagactccgtttcagaaataaataaataaaataaaaataaaaataaaaaaataatagaa- 
atttaaaaataaaataaagggcttrtcctcacctactccactaagtataagggaccct 
tacccccgacattacta7taaatataacggacttttcgtctcctccccatgagcaata- 
atgagcttttcagacctccctctcccaatataacggtttgttcctgttgcctcttcttt- 
ttcctgtgggatcccccttttccccaacccccaactgtcgggaggtccccatgacttc- 

40 tcccctgggctcaccccgaagtagttccgcggcacgtagccctcctggccgtgcag- 
cgcggcccaccaccagtcggtctcctccggcccgtccctgcgcagcacggtgac- 
cgactcgccctcgcggaaggacagctcgtccccgaactcggcgctgtagtcccaga- 
gagcgtacactgccccgctgtrcatcagccccatactctgctcgacgtctgaaacat- 
gccacggaggggaaggtgagagcctggcccagggggtccaggaagaggggo 

45 cacgtggggtccaggacagaccctggaatttggcgcctgtcccagcaaccacctgaa- 
atgttgtgtgtgcccatggctgtggatgggaaccggagctggagtcagatgccgg- 
gactggccgtctttgagcgttcgaggaaactgggggaggcatgccagtgggccacc- 
cactcccgaggcagggtcagaggctcccattt o ittctttctttttttttttnti i ga- 
gacagagtctcgctctgtcgcccaggctggagtgcagtggcacgatctcggctcac- 

50 tgoaacctccgcctcccgggttcacaccattctcctgcctcagcctcccgagtagct- 
ggg actacaqqcgcccgccaccacgcctggcta a i i 1 1 i ggt ai i i \ ia gtagagt- 
cagggtttcaccgtgttagccaggat6gtctcgatctcctgaccttgtgatccgcc- 

CACATTGGCCTCX: CAAA GTGCTGGGATTACAGGCGTGAGCCACCGCGCCCOGCCTTT- 
I 1 1 1 1 1 II I 1 1 1 1 1 I I M 1 1 GAGATGGAATTTCGCTCTTGTCGCCC AGGCAGGAGTGCA- 
55 ATGGTGCGGTCTCACTGCAACCTCCGCCTCCGGAGTTCGAGCCATTCTCCTGCCT- 

CAGCCTTCCAAGTAGCTGGGATTACAGGTGTGCGCCACCATGCCTGGCCAATTTTTG- 
TATCTTTAGTAGAGACGGGGTTTCACCATGTTGGTCAGGCTGGTATCAAACTCCTGAC- 
CTCAAGTGATCCACCCGCCTCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGO 
cacctggcccggccctcatttccttcttgtacattgctgaatgcocgtgtcaacccta- 
gaggtccagtcttttgccctaccctggcgcttagcttaagtggtac^gtctctaa^ 

GAAGATTCGCACXTITCfCrrGAATGATAGGGTCCrrrTAAGTTGGCTCATCT 
TTTTCTTTT CI rTTCTTTTCTTTl I GGAGACGGAGTCTTGCTCTGTCGCCCAGGCTG- 
GAGTGCAGTGGCGCGATTTCGGCTCACTGCAACCTCCGCCTCCTGGGTTCCAGCAAT- 
TCTCCTGCCTCAGCCTCCAAAGTAGCTGGGaCTACAGGCCCACGCCGCTACACCCGG- 

ctaaattgttrratatttttaatagagacggggtttcaccgtgttg 

ggaaatcctgagctcatgcaatccgccgg cctc gagc ctcccaaagtgctaqq atta- 

caggcatgagccaccgcgcctggctttctttttci 1 1 ic i i 1 1 c i i ii i i i i i ttcaga- 

caaggtctcactctgccacccaggctgcgggagtgcagtggtgagatcaagcttact- 

gcagcctcgaacttccagattcaagcaatcctcctgcctcagcctcctcctgattctt-- 

tatgttattattaaatattttgtaggccgggcacagtggctcacacctataatcacag- 

cactttgggaggccaaggcaggcggatcctctgaggtcaggggtttgagaccagcc- 

tggccaacatggcaaaaccccgtctctactaaaaatacaaaaaaaaaaaaaaaaaaagt- 

tagcgggccgtggggcccttgcctgtaatcccagttactcgggagcctgaggcag- 

gagaatcgctttcaccgaggaggcagaggttgtagtgggctatggtgccattgcac- 

tccagcctgggtgacagagcaagactctgtctcaaaaaataaataaataaaaataa- 

ataaatatttcgtagaggtcaggtgtggtggctcacacctgaatcttagcactttgg- ' 

GAGGCCAAGGTGGGCAGATTGCCTGAGCTCAAGAGTTCGGGACCAGCCTGGGCAA- 
CACTGCAAAACCCCTTCTGTACTAAAAATACAAAAAAATGAGTCGGGCATGGTGGT- 
GAGCACCTGTAGTCCCAGCTACTCAAGAGGCTGAGGCAGAGAATTGCTTGMTCCAG- 
GAGGTGGAGGTTGCAGTGAGCCGAGATTGAGCCACTGCACTCCAGCCTGGGTGA- 
CAGTGAGACTCTGTCTCAAAAATAATAATAAATAAATATTTGTAGAGACAGGGGGTCTC- 
TACAATGTCTTGTAGCCTGACCAGGCTCACCTTTCAAATATATAACCCTCTGTCTCACC- ' 
CATAAGTCCTAGGACCTGCCTCACTCCAACTCTCCGTGAAGTTCCTTGCCCACACCGA- * 
GATACAACTGGCTCCTCCAGGTGTGAAATGACX5CTGTGCACAATCCCCGTGGCACAG- 
CCTACTTCGCCCTGCCCGTCGGGGAACCAGGTGATGTAGCCTGCCCCCTGGAGAGA- 
TAGGGTACAGCCTTGTGTCTTCCTACAAGCCCCTTTCTGGCAGCTGTAGCCTGCTCAC- • 
CTGCCAGTGGTGTGGCAATGCCTCTCCCACAAGTGGCAGAGCCCACCTGCCCAGAG- 
* CCCT ATGCCAGGTAGATGGCAGGGTTGAAACGTTCAGCTCCTCACCCTTGAAGATGT- 
GAAAGGTGAGCAGAGCAATCTTCACAGCCACTCTCCTCCCGAAAGGTGTCCAGCTCG- 
CATAGCAC AGCCTCCATGTGCC C 1 1 1" I CCCTTAGGAGGGCATAGTCCCCCCACCCO 
CGCAAGCGGTCCATCCCTCATCCTCCTCCTCGGCAATCCTGCCAAGTGGTTGGTAr 
CAGCCCCCATACCCTTCTCTCCCTAGTAGGGGGTAGTTGCTCCCCTCCCCGCTCCTG- 
CGCACCCGCCAGGTACCCAGGCGCCAGCAGCCCTGCCTCGCACCTGCCAGGTAGGT- 
GGCGCAGTCAGCATAACCCTCGCGGTAAGGGTCGCACTTCTCGAAGGCGGTGGCGC- 
CGTCGCTGAGCGTGGTGGCGAAGATTGCAGCGCCGTGCTGCACCAGCGCCATGCA- 
GATGACTGTGTCGTTGCACGACGCCGCGCAGTGCAAGGGTGTCCTAGGCGTGGGG- 
GTGGGGGGTTGCGGGGAACGATGCGTGAGAGGCTGCGCGTCCGCCCACGGGGGAC- 
CCAGCCCACCGCGCGGGTCGGGGCTCACCAGCCGTGGCTGTC6GGGGAGTTGA- 
CATTGGCACCCGCGGTGATGAGGAAATCCACGATAGAGTAGTTGGCGCCGCAGAT- 
GGCGTTGTGCAAGGCAGTGATGCCCTCCTCGTrGGGCTGGCTCGGGTCGTTCATCT- 
GAGTGCACCGGGGGAGGGGGAAGACTCAGTCCCGCGGCTGGCATCTGCGATGCCC- 
CCGCCGTGCCCACCTCCCGCTCAGCAGCGCTCACCTCCTTCACCGCCTGCTGCAO 
CACCTCCAGCTCCCCGGTCAGCGCCGCGTCCAGGAGGAGCACCAGAGGGTTGAGG- 
CGCGCGCGGCGGGCCTTGCGCGGGGAGCCCGCCTTCCGCAGCACAGAGCGCATC- 
TCCTGGGGGACAGGGCGCAGAGGTCAGCGACTTGGAGGGATTGTTAGTATATCCAT- 
GATCTAGAGTAGGAAACAGAGGTCCAGGGACTTGTGGCACCCATCTAGACAGGGGTA- 
GAACTGGGATTCCCTCGGGATGGGGTGAGGGGGTGCCTTCGATCTCCTCCTAGAGCC- 
TCCAGTTCCCTGCCATAGACAGGGAATCCTGTGATTTGAGAATCTTGGGCCOTGAAAC- 
TTGGGAGAAAGCTGGGGGGCCATGGGATTGGTGGCAAAGTAATTCTATCAGTT- 
CAAAACAATGATTGTGGAAGCCAGTTATGCAATTCACACACAGTCTCACATTTCTTTT- . 
GTTAATAATGAATGCAATGAGACACACATGACAAAATGTTACCAGGAGTGTTCATTG- 
CGGATGTTTGGAATTTGAGCATTTTATTATTCCTTGTATTTTCC I I HON I I ICTCTTT- 
iTTTlTTTTTTn JG AGATGGAGTCTCGCTCTGTCACCCAGGCTGGAGTGCAGTG- 
CAGTGGTGTGATCTCAGCTCACTGCACCCTCCATCCCCCAGGTTCAAGCAATTCTCCT- 
GCCTCAGCCTCCTGAGTAGCTAGGATTACAGGCATGCGCCACTATGCCTGGCTAATTT- 
TCATAI I 1 1 1 AGTAGAGACAGGGTTTTGTCATGTTGTCCAGGCTGGTCTCGAACTCCT- 
GACCTCAGGTGATCCACCCACCTCA6CCTCCCAAA6TGCTAGGATTACAGGTGTGAG- 
CCACTGTGCCCAGCCTCATGGGCTTTCTTATTT7TO 

TATTCTGGGCTGGGCGAGGTGGCTCATGTCTGTAATCCTAGCACTTTGGGAGGCT- 
GAGGTGGGAGGATCACTTGAGCCCAGGAGTTCGAGAACAGCTTGGGCAATATAGTGA- 
5 GACCCAGTCTCTACAAAAAATAAAAAATTAGCCTGACATGGTGGCGCACACC- 

CGTCGTCCCAGOTACTTGGGAGGCTGAGGCAGGAGGATTACTTGAATGGAAGAGAAG- i 
•GAGGCTTCAGTGAGCCATGATCATGCCACTGCACTCTAGCCTGGGCAACAGAGTGAr 
GACCCAGTCTCAAAAGAAAAAAAAATGCATTTATTTATTCCAAGTGTGTGAGTGCATAG- 
CATTTGTGATTCTGGTCTTTGCTGTTTCCAGAGTTT^ 

10 CAGAGATCCOAACAGCCACTGAATTCAAAATTCCCAGATGCTCAGTTATTTCAAGTTTC- 
CAATATQTrGTQATTGCAGAAATGCTAGGCTGTGCTATTTCAA ATTGCT GAGGGGC- 
CAGGACTTTGGAATCCAAAGATTCTATGATGGAGAACTTTAATATTTTTCTGTTAGAATT- 
T CTTTTTT1 1 GTTG G I lll ll I Q AQACAGAGTCTCGCTCTGTCGCCCAGGCTQGAQTQ- 
CAGTGGTGCGATCrrCAGCTCACTGCAAGCTCCGCCTCCCGGGTTCAGGCCATTCTCC- 

15 TGCCTCAGCCTGCCAAGTAGCTGGGACTACGGGCGCCCGCCACCACGCCTGGCTATT- 
TTGT AI I 1 1 t A GTAAAGATGGGQTTTCACCGTGTTAGCCAGGAAGGTCTTGTTCTCCT- 
GACCTCGTGATCCGCCCACCTCGGCCTCCCAAAGTGCTGGGATTACAGGTGTGAGC 
CATCATGCCTGACCTAGAATTTCATTTTAAAAGACTAGAAGGAAATGGCTGGGTGCG- 
GTGGCTCATGTGTGTAATCTCAGCACTTTGGGAGGCTGAGGAGAGTGGATCACCT- 

20 GAGGTCAGGCAGGAGTTCAAGACCAGCCTGGCCAACGTGGTGAAACCCTGTCTCTAC 
TAAAAATACAAAAATTAGGTGGCCGTGGTGGTGCACGCCTGTAATCCCAGCTACTCAG. 
"GAGGCCGTGGCATGAGMTCACTTGAACCCAGGAGGCACAGTTATAGTGAGCTGA- 
GATGGCACCATCGCACTCCAGCCTGGGTGACAGAGT 

GAAAAAAAAAAGAAAGACTAGAAGGAAATATTCAAAATGTTAATGATGGTTCCCTGT- • 

25 GAGTGGTGTG ATTTTGTCCTC I > IO I ICTAI I i I I A TTTATTTTCCCCAAGCTCTCTATG^ 

GTGTTTGGTGTATrTCTCTATAGTGGAATGTGTAAATTTAAAGTATAAATCTCAGCTGGG* - 
CACAGTGGCTCATGCCTGGTTTGAGACCAGCCTGGACAACATAATGAGAACTGTCTC-' . 
TACTGAAAATGTTAAATATTATCTGGGAGTGGTGGTGCATGCCTGTAGTCCCAGC- 
CATAGGGGAGGCTGAGGCATGAGGATCAATTGAGCCCAGTAGGTGGAGGCTGCAGT- 

30 GAGCCATGATCTTGGCACTGCACTCCAGCCTGGGCAACAGAGTGAG ACTCT GTC" 

TCGATAATAATAACCCTCTATTACAACATATCAGTGCATGAA7TTGTGATTTTATAATT- 
CAAAATATGAGCATCTTTAATTGTCAGATTTGGTGACTTCAAGAATCAGTAATAAT- 
CAGTCTATG ATACTAACTTTATAATTA I 1 1 1 1 1 I I AAGAGAAGAGTTTCCTTTTATTT- 
TATTTTATTTGAGACAGAGnrTTCTCTCTGTTGCCCAGGCTGGAGTGGAGTGGCGCA- 
. 35 ATCTCGGCTCACTGCAGCCTCTGTCTCCTAGGTTCAAGCAATTCTCCTGCCTGAGCC-- 
TCCCGAGTAGCTGGGATTACAGGCATGCACCACCAGGCCCAGCTAATTTTTGTATTTT- 
TAGCAGAGACGGGGTTTCACCATGTTGGCGAGGCTAGTCTTGAACTCCTGACCT- 
CAAGTGATCCACCCGCCTCGGCCTCCCAAGGTGCTGGGATTACAGGCATGAGCCAC- 
. CGTGCCCAGCCTAACTnTATAATTCTAAGATCGTGTTCAAACCTTTAAATGCTCTAGGG- 

40 CTCTAAAATGTTACTATCCTAAGACGGTGACACTAGCGTTTGATTCTTACATT 

TTTTTAAGTTTCTCTGTGGCCAGGACTCTGTGATTCTACAATGGGATGCTCAGCCATTT- 
CAACATGTTGTTATTCATCCCCTCTTGATTTCAAAATCCTGAGCCTCAAGGTTCCTTGC- 
CTTTACTTTCAGGAGGGCCTAGGAATAGGCATTTrGGGGGGGTCCACCTGACCCCTG- 
CTTCTCTG AG AAGTGATCTCTTCCCG CTGTCT ACG C ACACG GAGTGTTC AGG ACTGT- 

45 TCCATGTGGCTACAACCCTCTTCCCAGTCAAGATGCAGGGACCAAGATCAGCAGGA- 

GACCATCCCCTGGTCCAATGGTGACAACAGTAAGAGCAGTTAACAGTTATGTGCCAGG- 
TATTATGCTAAGCACTACATTAATGTATTTAATCTTGGCGGGGTGTGGTGGCTCACACC- 
TGTAATCCCAGCACTTTGGGAGGCCAGGGCGGGCAGATCACTTGAGGTCAGGAGTT- 
CAAGACCAGCCTAGCCAACACAGTGAAACCCCATCTCTACTAAAAATACAAAAATTAG- 

50 CCAAGCGTGGTGGCATATGCCTGTAATCCCAGCCACTTGGGAGACTGACGCAGGAGA. 
ATCACTTTAACCCAGGAGGTGGAGTCCAGCACCCAGCCGAGACTCACTTGTTTTTATT- 
TATTTATTTATTT AiT I 1TA1 rTTTATTTTTn I GAGACGGAATCTTGCTCTGTCACC- 
CAGGCTGGAGTGCAGTGGCGCGATGTCAGCTCACCACAAGCTCCGCCTCCCGGGCT- 
CACGCCATTCTCX5TCTCAGCCTCCAGAGTAGCTGGGACTACAGGCGCCCGCCACCAC- 

55 CCCCAGCTA A1 I I I I GT AI.J 1 1 1A GTAGAGACGQGQTTTCACCGTQTTAGCCAGGATG- 
GTCTTATCTCCTGAC7TCGTGATCCGCQCGCCTCGGCCTCCCAAAA 
CAGGCATGAACCACCACGCCCGGCCrTATTTArrTATTTATTTAGAGATG^ 
TCTGTCGCCCAGGCTGGAGTGCAGTGGTGCAGTCTTGGCTCACTGGAACCTCCGCCT- 
TCCGGGTTTAAGCGATTCTCTTGOCTCAGCCT 
GACCACCACTTCTCCTGTTGTCCTTCCCAGCTTC^^ 

TTATAAGACAGGAAAAAAAGGGAGAAAGCAAAACGCTGGAAAAAAACAGAAGTACGA- 
TAAATAGCTAGATGACCTTGGCGCCACCATCTGGTCCTGGTGGTTAAAATAATAATA- 
5 ATAATATTAATCCCTGACCAAAACTACTGGTGTTATCTGTAAATTCCAGACATTGTAT- 
GAGAAAGCACTGTAAAACGTTTTGTTCTGTTAGCTGATGTCTGTAGCCCCCAGT- 
CACGTTCCTCACGCTTACTTGATCTATCGTGGCCCTTTCACGTGGACCCCTTAGCGTT- 
GTAAGCCCTTAAAAGTGCTAGGAATTTCTTTTTCGGGGAGCTCGGCTCnTAAGA CGCT *- 
GATGCTCCCGGCCGAATAAAAAGCTCTTCCTTCTTTAATCCGGTGTCTGAGGAGTTTT- 
10 GTCTGTGGCTCGTCCTGCTACAGAATTACAGGCACGCGCCACCGCTCCGGGCTAATT- 

tttgt ai rrm ia gtagacagggggtttcaccatgttggtcaggctggacttgaacc- 
tctgacctcatgatccacccacctcggcctcccaaagtgctgggattacaggcgt- 
gagccaccgcgcx^cggccgagactcactattttataagaggagagagcaaagccag- 
gaacagtggctcatgcctctaactgcagcaatttgggaggctgaggcagqtggat- 

1 s catttgaagtcaggagtttgagatoagcctggccagcatggtgaaac^tgatctctac- 
taaaaatacaaaaattagccaggagtggtggcatacacttataatcccagctacttgg- 
gaagctaaagcgggaggatggcttgaacctgggaggcggaggttgcagtgagc- 
cgaggtcaagccactgcactccagcctgagtgatggagcaagactctgcctg- 
gaaaaaaaaaaaaaatagaggagagagcagagcagacacaagagacacagagacaga- 

20 gagggagagaagagagggtgactgctttgattcaggcaagacttctcagtcccaga- 
atgaacccactgttgtgccaagactcagtcatgtccaggtgtatgactcgagattgct- 
gaaggaatgcccggggcaggggacaggcacaggttattggagagaaggagcaga- 
gaacatctctatgtggccaagactcccagatggccctccatatagtcacacacagc- 
tatcctaaagactacatttcccagcatcccattgcaatgaggctcctggccagtgg- 

25 gagcaggcagagtgatgtatggaactcccaggttctgcctgaaacaggaaagggcac- 
tttctcttcttcittctctcttcctggctggagggcagacttggtgacagccatctag- 
gaccatgaaggcaggcttactccccgatggatggcagagccccaggtagatagagcc- 
tgggtcctgactccagtgaggtgcctacagtcctgggctgcaaactcttggacttc- 
tactcaaaagaggagaaaacttcgatctcatct^aagccactatatttggggggctct- 

30 ttgctacagctcctggattcatgtagcaaacataccccggtttcctcctgtattact- 
taccatgctctgcggctgctctggtgggctgctctgggacggggccgggggtgga- > 
atgggagctggtggggcaggagcagggggccctgccctggcctcagatccctcagt- 
gatgggggacagctctggctccggccccocgggccctggccccccatgacgatg- 
gaagaggcggctgatgatctgctggtactgtttcttgtgggtaggggggagggccar 

35 cagcaggggcctgctccatggagcccctgcgtttgaggggccggggaatttccgc- 
caacacccgtgccacctcctccagctcgggcaccgactgtgcctccggtggcagtg- 
ctggctgcagcctcgtggggctgagaggccttgctacagggccttcatccacatcg- 
ccagcctccagcactggtgtcagcagcccctctatctccggctcaggctccagctcg- 
gtggggggtttggggggtcctagccggaacaagagcccatcagaggacaggtccc- 

40 caggagacacccaacactccctctccacaacttccagggcatacaaccaggacatgat- 
tttctgtgtgacctcagggaagttccttgccctctctgggctacactttccttgggct- 
gtgaataatatacaattatgatgcctcccatttattgagcagttagtatgtgcctggcg- 
ctttacatgcctaccttattgtaatctcaccactgctttgtgagqtagatacactgc- 
" catctgcacattaccgaaagggaatctgggcctcagagaggacaagtcagttgcc- 

45 caaagccatgcagttgggacttgaactcagttctggctgactctagaatctacttc- 
taccaaccgtgatagatgtgattttctgagatcctgagagtttcctctcctaacatct- 
caggcagaaaactccagcaggaagtagaatcctggtgtttaatgatttcttctctgtct- 
tactcattctgacagtaaagcaggtggaaataaaaatatgcattattggct- 
gagtcgagtggctcacacctgtaatcccagaactttgggaggccgaggcaggca- 

50 gatctctrgagatcaggagtttgagagcagcctggccaacatggtaaaaccctgtctc- 
tactaaaaatacaaaaaaaaaaaaaaaaaaaaaaaaattagctgggcgtggtggcacat- 
gcctgtaatcccagctactcggaaggctgaggcacaggaatcgcttgaacccag- 
gaggcggaggttgcagtgagccgagattgcaccactgcaccactgcactccagcct- 
gggcaaaagagtgagatttcatctcaaaatatatatatatacacacacacacacaaaca- 

55 cacacacacattatatatatagtgtatatatatttttatatagtatgcatatacatataa- . 
ataatacacacacacacacacggctgagcatggtggctcatgcctgtaatcccagcac- 
tttgggaggctgaggtgggtggatcacctgaggtcaggggttcgagaccagcctgg- 
ccaacatggcaaaacctcatctctactaaaaacacaaaaaattagttgggtgtggtg- 
GTGCATGCCTGTAACCCCAOCTACTTGGGAAGCTGAGGTAGGAGAATCGCTTGAACC* 
TGGGAGGTGTAGGATGGAGTGAGCTGAAACCTCACCACTGCATTCCAGCCT6GGCAA- 
GAAGAGTGAAACTCCATCTTGGCTGGGCACGGTGGTTCACGCCTGTAATCCCAGCAC- 
TTTG6GAGGCCGAGGTGGGCAGATCATGAGGTCAGGAGATCGAGACQATCCTGGC- 
5 TAACATGATGAAACCCCGTCTCTACTAAAAATACAAAAATTAGCTGGGGGTGGTGGTG- 

ggcgcctgtagtcccagccactcgggaggctgaggcaggagaatggcgtgaacccg^ 
gg aggcggagcttgcagtgagcaagc accactgcactccaacctggaag aaagag- * * 
cgagactctgtctcaaaaaaaaagagtgaaactctgtctcaaaaataaataaataa- 
ataaaccccaaaacacacacacatacacattatttcattgaatccccgtcacaattctar 

10 tagggtagatattattaatctctcttcacagacgggaaacagagtttcggacaagtaat- 
ttatcttcagtcacacagcaagttagcagtgaagagagactcx:agcccatctgct- 
taactcactgatctcacacctcaaaatattaataaattattataactaatatggtagc- 
tatttatttgagactgggtctcactctgtcacccaggctggagtgcagtggcgctat- 
cacaqctcactgcagcctggatctcccaggcttaaatgatcctc ccacct cagcatcc^ 

1 5 tgagtagctgggactacaggcgcccactaccatgcccggcagai i i ii j gtacttt- 
t al 1 1 1 t agtaaagtctattttagtitcactatgttgcxc^gqctggtcttgaactcca- 
gagctcaagcaatcctgtctgcattagcccaccaaactgctaggattacaagggt- 
gaggcacggtgcctggctaatatggtagctattgatagcttactatgtatcagatcc- 
t attt at7t attt a \ i \ \ i gaqacaqagtctcaccctgtcacctgtgctggagtgcagt- 

20 ggcatgatcttggctcactgccacctccgcctcc ttggctcaa6ctgaqtagc tag- 
gactacagtggtgagccaccatgcccagctaa 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i igata- 
gagatgggatttcatcatgttgtccaggctggtcttgaactcctgacctcaagtgatc- 
tgcccacctcggcctcccaaagtgctgggattacaggtgtgagcaactgcacctggc- 
. ccatcaggtgctgttttaaaggctttatatgaatttaataacatatgtcaatag- 

25 gatcgattctatcattatttgc ci i hi ii i 1 i i 1 h 1 i i i i i i gaggcagagtctccc- - 

CGTCACCCAGGATGGACTGCAGTGGCGCAATCTCGGCTCACTGCAACCTCCACCTCC- . 
CGGGTCCAAGTGATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGGACTACAGGCGCC- . ■ 
CGCCACCATGCCTGGCTA A1 I I 1 I GTATTTTTAGTAGAGATGGGGTTTCATATTGGC*' . 
CAGGCTGGTCTCGAACTTCTGACTTTGTGATCCGCCCGCCTCGGCCTCCCAAAGTGC> 

30 - TGGGATTACAGGCATGAGCCACCGTGCCCGGCCCATTATTTCCCTTTTAGACTCAAr . ' 
GAAAATTGAGGCCCAGTGAGGTTAAGTGACTTGCCCAAGGTCACACAGCGTGGAAO- 
CAGGCAGTCTGGCTTCAGGGTCCACACTTAACCTTTGAGCTATCCCTGGCTCCTACC-* 
CAAATTCCCAAACTCACCTGGCCTAGCTCTCTGCAGGGACAGTGCTTGTAAAGAGG- 
CATTTGGCTGTGATCTCCCCACCTCfcCAGGGCTGGTCTGGTCCCCCTGCCATrrGTCC- 

35 TCCCTTCACCCAGTCCTCTAGGGCCCTCATTGCTGACTCACCTTCGTTCACAGGGGC- 
CATGTCTGTTGGGGATGCTGGGGGGCTGGGGTAGGGGTTTGGGGTTGGGTCTGGGG- 
CTGTGGGGGCAGCTGGGGCTGTGGTTGTGATTGTGGCTGGGGCTGTGGTTGTGGTT- 
GGGGCTGCAGCTTAGGCGGGGGTGCTCGGGTGAAGAGGGGGGACCCAGGGAGCAT- 
GGCGCGGCTGGCCCCGTGCTCCCAGAAGGCGTTCTGCAGCTTGAAGATCATGCT- 

40 GAGGGGGATGGGACGCTGGCGCGGGGCCCCGCGGGGCTGGGGGCTGGAGGGGGG- 
CATGGGGATGCGGCTQACGGGCTGCCAGCTGCGAGGCAAAGTGCCCGACGGCCG- 
CGCGGAGCCCAGCGAGCGCCGGTAGCTGCCCGCGTCTGAACGCCGGTCGCTGGC- 
CAGAGGAGAGACCTTGTAATTGCGCGGCAGGGTGGCGCTAGTGAGGTTGTCCTGGG- 
G AAG AG GG AAG GGAG AAG GG G ATCGGGTG AG AGAGGG AAGGTGGAG GGG AGG- 

45 TAAAGACAAAAGACGAGAAGGGAGAGGAGGTGAGGGAAGCCCTGGGAGTGAGGGA- 
GAAGAAAGGGTGAGGAAGGAGCAGAAACCCAGCACAGTGAAGGGAGAGGGTGG- 
GAACGGGCGCCGAGACCCAQATCGCAGCCCCGAGGGGGAGACTGGCCTTGACCC- 
CGCTCCCCCACCCCAGTCCTCGACCTTCCCCAGCCTCTCCTCCCCAGGCGTCGCCTC- 
CTCACCTTGCCGGTGCCCCCCAGTCCATCCAGGCTGCTCTCCCTCCAAGGCAACAGC- 

50 TGCAGGCTCGGCGAGGCAGGCCTTGCGAAGACGTCCAGGCCTGCGGGGCGGGAAT- 
CATTAGGGTCTGTGGGGCTGCCTCTCCTCCGGGTCCTCCATTCCCCGGGCCTCCAC- 
CACTCACGTTCATAGCTCGCTGTCTGCGAAGGCTTCTTCTCGTACGCCACGTCCAGGT- 
CAGACTCGTTCCAGGCTTTCGGAGGCCGCCGGCGCAGCGTCAGGTCGTCTGGGGA- 
GAAGTTTCCAGGGAGGATGAGACGGGAGGGGTGGCGAGCCCCGGATCCTGCCCGCT- 

55 TTGACCCCGCGAGTCAAAGGCCCCGCGAGGGGCCCCTGGGTTCACCTTGCGCGCG- 
CAGAGGCGGGGCGAATGCGCTGCCGCGGGAGCCTAGCAGGGAGCTCCCGAAGGCG- 
GACGCTGGCGCGTCGTAGGCTGTGGCAGGGGGGCGCGGTGACGGCCCACGCTCGG- 
GGAAGAAGGCCTGGGGCCCCTCCGCCAGGGGGCTGCCGCGGGGGGAGCCTGCG- 
CGGCCCAGGAAGTCGAAAGGCGTGGGGG6ACCCTGCTGG0GGAGCGGGCCTGGCC- 
CGGGCCGCGGGGAGGGCGCACGGCCGAGGGAGCTGCCTGCGCCATCGAAGGCG- 
CGGGGCCGGGGCGAGGTCGCGCGGTCCAGGCTGCCGTAGGCGTCCGGCTGCAGG- 
TAGAGCGGGGTGCGCGGCGACGACGGCCGTCCCTTGGGGGACAGCGGGCTGTAGG- 
5 GGTGTAGGGTTGGGGCACTCTCTGATCGTCCGAACGGGGTGTCJGCGCCGTCGGT- 
GGCCGCCTTCCGGGGGGACCCTCGGCJGCCGAAGGGCTCAGGGATCGAGCTGGAG- 
CTGTACC6GGGCGGCTGTGGGGAGGCCAGGGCATTGAGGGATGGATCAAAGGAGA- 
CATTAGTGGAAGGGTTGGTGTGTGGGCGGGGGTGTCAAGAGAGATCACTGGAGGT- 

caagccagaggaggctgaccg.gccatggaaattcaggcacagagagcccaggtgag- 
10 tagtggtggggagacagccctgaatcagcactgtggctagcccattactctatgt- 

cacctttatgccacttaggtaaacacctctttcct 

cacttccactggtccgctcttttctatttctrtcmctt^ 

tcrtttctttctttcctctctctccttccntcctttctct 

tccctccctgcttgcttgctttctctctctctc i" ii c \ ttctttctttctttctttctt- 
1 5 tctttctttctttcttttctatctcggctcattggagcctcaacctccct 

gtgatcctcccacttcagcctcccaagtagctgggattacaggtatgcaccacca- 

CACCTGGCrrAACTTTTGTATTTTTAGTAGAGACAGGGTTTCACCATGT^ 

GGTCTTAAACTCCTGACCTCAAGTGATCC GCCTG TCTCTGAAAGTGTTGAGATTA- 

CAGGCGTGAACCACCGTGCCCAGCCAGATTTTTAAAAAATCATTTGTAGAGGCTGGTC- 

20 TCAAACTCTTAGTCTCAAGCAATTCTCTCACCTCGCCTTCCAAAGTGCTGGGATTC- 

CAGGTCTGAGCCATCGCGCCTGGCCTGGTCCCCl I 1 1 1 1 CAAGTTCCCTTGAAGAGC- 
CCACAACCTGCATAACTATATGGGGCAATTTTGCCTGAAATCCAGGCCTCTGGTCTG- 
GACTGTGGCGAGAGGCTGGCTTTGGAGATCAAGGTGGGAACCAGGCTTACCCTA- 
GAAGGGGGTCCGGCXiTGCGGGCCAGGAGGCGCGGGAGAGTCTGACGACAGCGAO 

25 TCCAGCTGCTTGGTCAGTTCATCCACCTTGGCCGCCGCCGTGTCCAGGTCCATCTGC- 
TTCAGATCCATGTGTTTCATGGCCAGCGCTGGGAAGGTGGGAGTGGAGGTAAGGACC- 
TGGCCTCCTGGCAGGGGCCGGCCTCAGCACCCCTCGCCCGCTGCCGAGGTCCCpG- 
CCTCGCCAGCCCCGCCCCCTACTCCAGCTTACACTGGAAGTTCATGTCCAGAAAGTC- 
CCGCGCGCTCTGGAATGCCTCGCTGTCCATGGTGCCGGCCGGAGCGGGCGCCTG- 

30 CATGGTGGGGAGGGAGGGAGCTGGCTAAGACCCCGCCCCTCTAGACCCCGCCCT- 
CAGGGAGTCAGACGCCGTCAGGAGCGGGACAACGCCTCAACTCAGTTCCTTCCCCT- 
GGAAGCC^TTTACCCTTTCACCTCCCCAGCTGGGAAATGCCAACTCCTCCAAAGC- 
CAAGTCCATGCGCCAOGGAGAAGTCCAAACCCAGTCTAAAACCTCCGGAATTCACTT- 
tctctttc i rrm I CTTTT CI IHIHII I l l Mil I GTGTATGTGTGTGAGACA- 

35 gagtctcgctctgtcgcccaggcgggagtgcaatgacgcgatcttggctcactg- 
caacctccgcctcccgggttcaagcaaatcttctgcctagctgggactacaagcgcg- 
cgccattatgcccggctaatttttgtag7tctgggattacaggagtgagtctccgcgc- 
ccggccgtgtccatctctttatctcagtcctaagacctgaatcactccttgaacaat- 
tatctattgatcacctacaatgtgccggtaaacataggatg gaataa ctat gaatta ct- 

40 GAATGTTTACTAGGGACCAGGACGCACTGTGCTAGATCCTGTTTTTGTTTGl I 1 1 iGA- 
GATGGTGTCTCGCATTTTCGCCCAGGCTGGAGTGCAGTGGCGCGATCTCGGCTCACT- 
GCAAGCTCCGCCTCCAGGGTTCATGCCAGTCTCCTGTCTCAGCCTCCCGAGYAGCTG- 
GGACTACAGGCGCCTGCGACCATGCCTGGCTAAATTTTTGTATTTTTAGTAGAGACGG- 
GGTTTCACCGTGTCAGCCAGGATGGTCTCGATCTCCTGACCGCGTGATCCATCTGCC- 

45 TCGGCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCCCTTGTTT- 
TTGI MM I A ATAATAATTCTGCTGTCTGCrrGTGTACTAQAACCCATGCCTACTGCTTG- 
GGGTATAATGTAGTAAATGTAGTAAAAACAATATCCGCCGGGCGCGGTGGCTCACGO 
CTGTAATTCCAGCACTTTGGGAGGCCAAGGAGGGCGGATCACGAGGTCAGGAGAG- 
CGAGACCATCCTGGCTAACATGGTGAAACCCCGTCTCTACTAAAAATACCAAAAAT- 

50 TAGCCAGGCGTGGTGATGGACGCCTGTAGTCCCAGCTACTCGGGAGGCTGAGGCAG- 
GAGAACGGCGTGAACCCGGGAGGTGGAGCTTGAACTGAGCGGAGATCGCGCCACTG- 
CACTCCAGCCTGGGCGACAGTGCGAGACTCCGTCTTAAAACAAACAAATAAATAA- 
ATATGTTTAAAACAACAACAACAATAACCAGC CAGG C GCGGTG GTTCACTCX5TGTAAC- 
CCGAGCAC7TTGGGAGGCCGAGGTGGATGGATCGCTTGAAGCCAGGAGACCAGCCT- 

55 GGCCAATATGGTGAAACCCCGTCTCTACAAAAAAATACAAAAGTTAGCTGGGCATGGT- 
GGCATGTGCCTGTAATCCCAGCTACTCAGGAGGCTGAGGCACAAGGCTCACTTGAAC- 
CTGGGAGGCACAGGTTGCAGTGAGCATAGATTGTGTCACTGCACTGCAGCTTGGGT- 
GACAGAGCGAGGCTCTATTTAAAAAAAAAAAAATTAATTGAGGGGCCACTCCCTTCTA- 
gagtgotgagaaatgccgtgcaccgaaagcntcatttgatggtcaaaaccaccctag- 
caggcaagaaagcatggctcagaaacatatgttcaaggtcaccct6caagaagtcgg- 
tagtaatcggtttcacacccgcatct aactt attctgggtcatctctaccagattagag- 
gggtcxxtagagggaagcgactgctcagcttcctttccctagggtccccattcagtg- 
gaggtctggctctcactgacccattgttagcaagaggaacagggaggtggccaggg- 
gtggaggggcagctgtggtcactggcccagtgggagggagctaggccactaggaac- 
cggtcaggccagcaccatccctatccccatgctagccaccacacccaccagctctgc- 
* cagctccctgctgcatcgaccacttagctctggcagtataggcagcagggcaggctg- 
gggcatgctgatacccgcctctgtctgggaagtcgaaggaacagaacctgttcaggc- 
tggcggctcatttggatgaacagggagtgtgtgaccttgggcgttgagtcctctc- 
cactccctgggcctcagtctccccaacatcaaagaagaaggcaaatcac cl u1ihh- 
ttttttgagatagggtctcgctctgtaacccaggctacaattgtgactcactacagcc^ 
tcttgacctcccagctcaagtggtcctcccacctcagcctcctgagtagctgagacta- 
taggtat agcctcgcaccaccacacccagcta ai 1 1 i ii 1 1 1 1 1 1 ui ' m i m ii 1 1 1 1 u 
tttttttgagacggagtcttgctctgtcgcccaggctggagttcagtggcgggatc- 
tcggctcactgcaagctccgcctcccgggttcacgcgattctccggcctcagcctcc- 
caagtagctgggactacaggcgcccgccactacgcccggctaatttttgtattttag- 
tagagacggggtttcaccattttagccgggatggtctcgatctcctgacctcatgato- 
cgcccgcctcggcctcccaaagtgctgggattacaggcgtgagccaccgcgcccgg- ■ 
ccacccagctaattttttaaaaacattttgtacactttgggaggctaaggcgggag^ 
gatcacgaggtcaggagctcgagaccatcotggctaacacaggtgaaaccctgtctc* 
tactaaaaaatacaaaaaaattagctgggcgtggtggcgggcgcctgtagtcccagc- 
tactcgggaggctgaggcaggagmtggtgtgaaccagggaggcggagctttcagt- 
gagccgagatcgcgccactgcactccagcctcggagacagagcgagactccgtcc- 
caaa aaaa aaaaaaaaaaaaattrgtagagacagatcaagtctcactttgttgctcagg-* 
ctggttttgaactcctgggctcaagcaatcctcccgcctcagcctcccaaagtgct- 

GAGATTAGAGGCATGAGCCACCACACGTGGCCAAATCAGCTATTCTGAAAGGCCCCTT- 

TAATCTCTATGAGCCCCAGACTTTCAAACTGTAAGGACCTTAGGACTGTAACTAAAGT- 

TCTACAGAGCCTAAACCCCTCAGCTAAAGAGCCTATTGTTGGAAAGTTCTGAGTCCAA- 

GATTCTATCTTTGGAACATTCTAGAATTCTCCAATTTGTCTAACCCAGAATTCTGAGTCT- • 

TTCTGTACCACATTCTACCTAACCCAGGGTTGCACTGCTCTGGAAGTCTAGATGGATG- 

GTATAGTGCAGCTGGTAAAAGCATGAGTAAGAAGTCAGACTTCAAAAATTCAAATCT- 

GAGGGCCGGGCATGGTAGCTTCTGCCTGTAATCCTTGCACTTTGGGAGGCCGAGGG- 

GGGAGGATCACTTGAGGCCAGGAGTTCAAGACXJAACATGGCCAACACAATGAGACCC- 

CATTTCTTAAAAAAAATTAAAATAAAATCATCAAATCTGGCAGCACCACCGTCCAACCC* 

TGACCACAGTACCTCAGTCTCGTAATCCGTAAAATGGGGATGAAAGTTCACCTCATAG- 

GACTACTGTAAGAATCCACCTGGTCAGAAGGTGCAGGAAGAATTCAGAGCTCTGAGA- 

ATTGAGGCCTCAGGAAGAAGAGACTACAGGAATAAAAACTCGGGCATTtAGAATlTCAr 

GAGATACACAAACAATACTTTGTTAACTGTTAAAATAGATAAATGAGCAAGTCTGTG- 

CAGCCCTAATGCCAGCTGTAAGTGACTCl I I 1 1 1 1 1 I CTTTTGGTAGAGATTTAGTCTC- 

TCTCGCGCCTGTGGTTAGGCTGGTCTCGAACTCCTAGCCTCATGGGATCCTCCCCGG- 

CTCGATCTCCCAAAGTATTGGGATTACAGGCGTGAGCACGGCGCCATGATCCCCAA- 

ATTTCCAAGATTCTCAGATTCCATACTGACATTCTCTGGCTCTCAGGAAATGCCAACCC- 

TGGGTGTGGGGCTGTCGCGGGGACAGGCGGTGGGGACGTCGGAGCCACCAGGGGG- 

CGGTCACGCCCGGACCCCCGCCAGGAGGGCGGACTGCGCCTGAGCTCAGGCCCGG- 

GGAATGCGCAGCGGGCCCGGGCAGGTGCTGTACATCCCGGGGCAAGGGAGCTGGG- 

CCGGGCGGGGTACAAGGGCGGGGCGCGGGGGTGGCGCGGGCCGTGTGTCTGTTCO 

CAGGCCTCTGCCCCTGACCTCTGCCTCCGAGTCCTCTCCCATGTGCTCCCCTCTAGC- 

TCTAGCTCXGAGCTCTCCCGCGGGCTCTGGGCCAGCCGCAGGTACTCTCCCCTGGG- 

CTCCTCTCTCCGCTCCACCCCTGGCTCTCCTTCCCTGGCCTCCTCTGCACCCCAGC- 

CAGG1TC7TTAGGGCTAAGGATCCTGTGGACTTCCTGGAGGAGTCATCTTCAGTAG- 

GAACCGGGTCAGAQAGCCAGACTGAGCTGGGAACACCCAGGCTGGACTCCTACAGC- 

CC TGTC GGGTCACACTGAATGTGGAGAGGCTCCACTGTCTCTGGGACTCGGTTTCC- 

TCCTTTGTGGACGTCTATGGAATGGGCTAGGGCCTTTCTTGCTCTAAGCCTCTACTTG- 

GGCTTGTTATTTAGCTTCTCTGTGCCTGTTrCCTCATGTGGACCATGGGAAGAATTA- 

ATACCTTCGCCTCAAAGGGGTATGAGGATTGAGTGACATAATTTATAAGCCGTGATTA- 

GAACAATGCAGTGCGCGAAATAAAGTTCACACATACAGGATTCATAATTACCAGAT- 

GTCCTTGGCTGTTCATrATAATAACACAGGGTCTGGCAACAGAGTGAGGGGTCCAGAC- 
TCAATGTA AI 1 1 1 1 U I i C GCCTAAAAGGGCCCTTTCAACTCTTTCTGAGATCATACAAG- 
CCCTGAGTTTTGACACCCAGGGTCTCAACTTCCTGAGCCCTTGCCTCTCAGAGTCC* 
TAAATTTCCCCTGTACATTGCTGAGTCTGGCCAGTGATCACCCTCAGTCACTTAGG- 
> GACGGGAGGGCTGGGAGAGCCCTGGAAGATTCCAGACAGAAGCTGGCAAAAGCC- 
5 GAGGGTGTGGGCAATATCCACTCTCCAGCCTCCGrTTTCTCCACTCGTAATGAG- 

GAGTCCTTCGCTGGGGTCAGCAAACCTTATTCAAAGGGAGACCTCTCAGTCACCCAAr* 
GATTCCTCTAGACAATGCGAGCTTTCCTACCTACCTACCTACC^GCTCTGAGCTTGG- 
TACAC CCAGAG CCCTGTTTTGGC AACC ACGGTTATTATTTTrAATTTC ATTTCAGGT- 
TATCATCAAATGGCCTTCAAGCCCAGACATTGGGAAACACTCCTCTCTCATGAGATGC- 
10 TCGCCrTCCCCCATTCTGTTTTTAATCCCCCTTCTTAGGACGCATGGGGGTTGAGA* 

GAACGGGGAGATAGACAGAGGGAGGTGCCTGGTCCTGCCCTCCCCCCGCCTCAAG- 
GACAGACAGAGACCTCCAGAATTAGCCTCTGTCCCTCCTTATCTCCCACAATACCC- 
CAGGTCAGACAGATGGGCGTGGAGGTGACATTTCTCACCTCAGGGTCAGGGCAAG- 
GAGCCCTGAGGCAGAAGGTTAGTCAGAAAATCTGGCGGGGGCGGATGGAATCC- 
1 5 CGTCCCCGAGAGAGCTGCAGAAGAAGGAGGAGGCAGAATCCTGACCCTACAAACTC- 
TACTGCCTGTGTGAGCTCCAAGCCTGAGTTTACCCGTTCCTCTGCGTGTAATGGTTAA- 
ATGCCCGGCTATGCAAACCTCCCAGAATCCAATAGCCGCTTTCCGGAATTCT6CCCT- 
GGGTTCTAGAACTACCTCTGCAAACCCAGCTGTTTCCCACCCCATAAGGCAATAGGG- 
GAGCCGACCTCCGCGAGGGGGTGCCCTAGGGCGGATGTCCCTTCTCTGGTTAGG- 
20 CAGGTCTGACGCCCAGGTTAATGACATGTTGGGTTCGCTCAGCGGCACAGAGGAG- 
GTTGGAGATCTGCCTCGGTGTTTTCTCTCCTACCCCGCCCCCATCCCCGAGC- 
CGAAAAGTCG GGGGAG AGCCGGGACACAGCCTCCGG AGG G AC CCCGGGTACCT- 
GTCCTGCTCCACTTCAGGAACCCAGGCTCCACTATCCCTGCCCCACXCTTAATTCTGC- 
TCAGAGACCTAGAAGATCGGTCGAGACAGCAGCTTGAGGCTGGCAGGGTGGTCACC- 
25 * CATTCCACCTTGAGCCCCACCAGTCTGAGCCTCTCATTTCTGACCAAGACTCGGGGAT- 
TCGAACCCCTATACTACCCAAAGACTCGGCTTCCTAGAGCCCCCCAGTTCGAGGGAC- 
TCAGGAATTCCAGCTCCAACGTCTCCCCGGGATGAAGGGGTAGAATCCCTCCATTC- - 
CAAGAATTCAGGCATCCGAACCCGCTTTCCTTCCCTCCAGTAAAACAGGCAACGGAGT- 
TTCCTTCTAAGGATCCAGGTGTCGGCGCGCCCCAAATTCCGCCCTGGGACCTGG- 
30' CGTCCGAGTCCCCTCCCAATWTCCCAGGGACGCGGGTGTTGGGCTTTTTCAGGGCC- 
TCTGGTCCCCAGGAGGGTGAAACTCACGGATCCGGGCAGATCCTGGCACCTGGGGG- 
CfTCCTCCAGCTCGGGCTCCGGCTTGGGGAGCGGAGAAGGGGGCGGGGCAGGAGC- 
TGGGAACAGGTTAGACGACGTGACTTGGGCTGGAGGGAGGCGGGTCCCGGTGGG- 
GAGGGGGAGCCAAGGTCGCCTCGAGCACOTTGGGACTTGTAGTCCCGGAGGGACAG- 
35 GACGTAGCCCAAGACGATCCCATTTGGATTCACCCAGAGTCGATTTCACAGACAG- 
GAAGGG CGAGG CCC AG AAGCCG AG AGCGACCAGG CC AGGGAGATAG AG AAG AGC- 
CGAGACGCCTGCCTCGCTGTGGCTGGAGACTGACTCCTGAGCCCTTGCCCCACCCCT- 
TCAGGCGCACTATCCCCTrTCCTGATGAGTATCCCCCAGGGTCfCTGAGCCCGAATC- 
TCCCCGTCGATAAAAAGCGCGGG7TGGATCTTCAAAGGATGTCCCAGCAAGAGTT- 
40 CAAAATCTTAGTTTGGACTACAACCCCCAGCAGCCTCCGCGACCGCCCTCGGGCGAC- 
TCTTTGCCTCGGGTCCTGTGGGAATTGTAGTCCTGGAGCCCGCAGGGCTGCACCC- 
CGGTGTCTCTCTCGCCCACGCGAAGGAAACCGTCTGGAGATCCTGGATAGGGGAAA- 
CATTTCCCCTTCCCCTTGACCCTCCCTCCGCTCTGGAAAGCCTCTCCCACCTGGGGA- 
GAAGGGGTGCCCCAATTCTGGAGTAGGATCCTAAATCTTGGCAGA6GGGGCGG- 
45 GAAGTGGCGCTGACACACTGGCCAGGAATGCAGTCGGGTCACCCTGTCTAGCCAC- 
CGTCTCGCGGCTCCAACCGCCGCCCAACGCGGGGCGGCCCCAGTGGGAAGG- 
GAAGTGGGTGCGTCCCCCAAATCTGTGTCCACGTGCCGCTGTTTACACGCTCCCTGG- 
GGCAGGGAGGAGTCGCCGATCAGGTCCCTTCCTGAAAGTCATCGAGGTTTCCCACG- 
CATGAGACTAAACCCCCGAGGGCATCTACAAGTCCCATTTGATCCACAAACGCTACAC- 
50 CGTGCCPAGCACCACTCCACGCGTGTGGGGCTCCTGGGTCCGAGGCTCCGCCC- 

TCGAGAACCACAAGCTCCTCCCCCTATGTTTCCCGCTCCCCCGGAGTCCAGAAGCCC- 
CGCCCCTGGCrGGAACTTCACGCCCTCCGGACGGATTGCOCCTATTTCTCCATTTTCC- 
CGCTTCTCCCAGTCAAGTTCTGAACTTGTGAGGCATCTGGGCCTCCCCAGAAGACATT- 
TAACACAGAAAGCACAGCCCTACTAACTAGTATTCTTACCTGTCTCTTCAAGAATTTCA- 
55 GACCAATCGACCGTCCTGTCTCTTTAAGGCTTAGGAAGAGCAGTGTGGCTGCCCCTT- 

gaoccgacgggcctctgactccagcaatacagcgaatcagcggctttcgg gaata - 
c at i i i i cggaaaaagacttcttcctcggttttctgctctgcacacgttgaaattttcc- 
CCAGTTTTTCCTGCAGATCGGGAGTCGAGCAATGCCTACCCCCGCGCTCCC6CAC- 

CAGTTGGGCGCTCCCGGATGATGCCCTACCCCTTTGGATCCACGTGGTCTGCAACCT- 

GGTGCGAGCAGCCCGGGCTACAGGGTTGCCTGAGGTGTGGGTCCCAGGATGGAG- 

GAGCCCCAGGCCGGCGGTGAGGGTGCGGGTTGACGGGGTGCGGAGGGTGCGTTG- 

GTGGAAGGAGAAAGGGGCGTCCGAGAGGGTTCGGGCGGAAAAGGAGGCGTACCTG- 

CAAGCAGGACTTGCGAAGAGCGTGCATTCCCAGTGGGCGAACGGGAATTCGAACG- 

GAGAGAGGGTTATCTTGTGGGGGGCTACCCGTGGAGAGCAAGGCGCCCCCAGGG- 

GTTGGATCGGTGAAATTGAGGTCGCCCCTGGGGAACAGGTGGGCAGAAAGGAr 

GAAACCAGGTTGAGGGGACTGGAGTGCTCACGAGGTTAAGACCAATGGACCGA- 

TAGGCGCGCCCTGCAAGATTGGACCGGCAAGGAGGTGTCAGTCGACCCCATTTCCCC- 

TTCTGCTGCAGATGCTGCTCGGTTCTCTTGTCCCCCCAACTTTACCGCGAAGCCCC- 

CAGCCTC^GAGTCCCCTCGTTTCTCCTTGGAGGCGCTGACGGGTCCAGATACGGAGC- 

TGTGGCTTATTGAGGCCCCTGCAGACTTTGCCCCAGAATGGTGAGTGGTCTTGTT- 

GACGGAAAAGAGGGTCCCGGTCCAGACCCCAAGAGCGGGTTCTTGAATTTGTCACAG- 

GAAAGAATTAGAGGTGAGTCACAGAGCACAGTGAAAGAAACAAGTTTATTGGAAAC- 

TACTCCTTTACAGAiQTAGAGTGTCCTCAGAAAGCAGGGGGAGAAACCCACAGOCCT- 

TTGTTAGTATTTCTACTTATAAGAAACTATAAGGAACTATAGTTAAACTTGGAGTGTG- 

CAGATAAGCTCACTAAAGGTAGGGGCTATTGGTGTTATCCACGACCATTAATCCTG- 

CAACCTAAGCTTGCTCATTTATGTTATATTTAAGTAATGGGGGCTGCATTCTTAGGA- 

CATTTGGACATTGTGCAGGCTrGGTGGAACAT GTTCT GTATGGCCATAAATATrCTGT^ 

ATTATAATTGGTGGTCAGCCTGGGATGTGGTTATTTTCAGGCCATAAGCATGAACCTT- 

GTAAGTGCCTAGCTACTCACTTTAAGATGG^^ 

CAGAGGCCAGCCAGGCGGAGTGGCTGGTGCCTGTAATCCCATCCTTTGGGAGGC- 

CGAGGCGAGCAGATCACTTGAGGTCAGGAGTTCAAGACCAGCCTGGCCAACATAGT-

GAAATTGTCTCTACTAAAAATACAAAAATTGGCTGGGCGTGGTGGCAGGTGCCTGTA- 

ATCCCAGCTACTTGAGAGGCTGAGGCAGGAGAATCGCTTGAACCCAGGAGGTGGA-

CATTGGAGTGAGCCGAGATCATGCCACTGCACTCCAGCCTAGGCAACAGAGCAAGAC- 

TCTCTCAAAAAAAAACAAAAAAAAAATCAAAAAAGCTTCCCTCTCCTGTTCCACTTAAG-

CCTCTGCCCTCCCnrGTTTCTCTCTGTAGCTTCAATGGGCGGCATGTGCCTCTCTCTGG-

CTCCCAGATCGTCAAGGGCAAATTGGCAGGCAAGCGGCACCGCTATCGAGTCCTCAG- 

CAGCTGTCCCCAAGCTGGAGAAGCGACCCTGCTGGCCCCCTCAACGGAGGCAGGAG- 

GTGGACTCACCTGTGCCTCAGCCCCCCAGGGCACCCTAAGGATCCTTGAGGGTCCC- 

CAGCAATCCCTGTCAGGGAGCCCTCTGCAGCCCATCCCAGCAAGTCCCCCACCACA- 

GATCCCTCCTGGCCTGAGGCCTCGGTTCTGTGCCTTTGGGGGCAACCCACCAGTCA- 

CAGGGCCTAGGTCAGCCTTGGCCCCCAACCTGCTCACCTCAGGGAAGAAGAAAAAG- 

GAGATGCAGGTGACAGAGGCCCOAGTCACTCAGGAGGCAGTGAATGGGCACGGGGC- 

CCTGGAGGTGGACATGGCTTTGGGGTCGCCAGAAATGGATGTGCGGAAGAAGAA- 

GAAGAAAAAAAATCAGCAGCTGAAAGAACCAGAGGCAGCAGGGCCTGTGGGGACA- 

GAGCCCACAGTGGAGACACTGGAGCCTCTGGGAGTGCTGTTCCCGTCCACCACCAA- 

GAAGAGGAAGAAGCCCAAAGGGAAAGAAACCTTCGAGCCAGAAGACAAGACAGT-

GAAGCAGGAACAGATTAACACTGAGCCTCTAGAAGACACAGTCCTGTCCCCGAC- 

CAAAAAGAGAAAGAGGCAAAAGGGGACGGAAGGGATGGAGCCAGAGGAGGGGGT- 

GACAGTTGAGTCTCAGCCAGAGGTGAAGGTGGAGCCACTGGAGGAAGCCATCCCTCT- 

GCCCCCTACGAAGAAGAGGAAAAAAGAAAAGGGACAGATGGCAATGATGGAGCCAG- 

GGACGGAGGCGATGGAGCCAGTGGAGCCGGAGATGAAGCCTCTGGAGTCCCCAGG- 

GGGGACCATGGCGCCTCAACAGCCAGAAGGAGCGAAGCCTCAGGCCCAGGCAGCTC- 

TGGCAGCTCCCAAAAAGAAGACGAAGAAAGAAAAACAGCAAGATGCCACAGTGGAGO 

CAGAGACAGAGGTGGTGGGGCCTGAGCTGCCGGATGACCTTGAGCCTCAGGCAGC- 

TCCCACATCCACCAAGAAGAAGAAGAAGAAGAAAGAGAGAGGTCACACAGTGACT- 

GAGCCAATTCAGCCACTAGAGCCTGAACTGCCAGGGGAGGGACAGCCTGAAGCCAG- 

GGCAACTCCGGGATCCACCAAGAAGAGGAAGAAGCAGAGTCAGGAAAGCCGGATGC- 

CAGAGACAGTGCCCCAAGAGGAGATGCCAGGGCCGCCACTGAATTCAGAGTCTGGG- 

GAGGAGGCTCCCACAGGCCGGGACAAGAAGCGGAAGCAGCAGCAGCAGCAGCCT- 

GTGTAGTCTGCCCCCGGGAAACTGAGGAACTAAAGAAAGCTGAAGGTGCCCACCTG- 

GGCCACCAGAAGGTGACACCCCCAGAATCCCTCCCCAGAGACTGCACCAGCGCAGCC 
1. A method for estimating the cancer risk of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or
- in a region complementary to SEQ ID NO: 1, or a part thereof, or

- in a transcription product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof, or

- in a translation product from a sequence in a region corresponding to SEQ ID
NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence
polymorphism response.

2. The method according to claim 1, wherein the cell sample is a blood sample, a
tissue sample, a sample of secretion, semen, ovum, a washing of a body
surface, such as a buccal swab, a clipping of a body surface, including hairs and
nails.

3. The method according to any of the preceding claims, wherein the cell is
selected from white blood cells and tumor tissue.

4. The method according to any of the preceding claims, wherein the sequence 
polymorphism comprises at least one mutation base change. 

5. The method according to any of the preceding claims, wherein the sequence 
polymorphism comprises at least two base changes. 

6. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one single nucleotide polymorphism.




7. The method according to any of the preceding claims, wherein the sequence 
polymorphism comprises at least two single nucleotide polymorphisms. 

8. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one tandem repeat polymorphism.

9. The method according to any of the preceding claims, wherein the sequence 
polymorphism comprises at least two tandem repeat polymorphisms. 

10. The method according to any of the preceding claims, wherein the cancer is
selected from skin carcinoma including malignant melanoma, breast cancer, lung
cancer, colon cancer and other cancers in the gastro-intestinal tract, prostate
cancer, lymphoma, leukemia, pancreas cancer, head and neck cancer, ovary
cancer and other gynecological cancers.

11. The method according to any of the preceding claims, wherein the cancer is
selected from skin cancer, lung cancer, colon cancer and breast cancer.

12. The method according to any of the preceding claims, wherein the cancer is
selected from skin cancer and breast cancer.



13. The method according to any of the preceding claims 10-12, wherein the skin
cancer is basal cell carcinoma.

14. The method according to any of the preceding claims, wherein the assessment
is conducted by means of at least one nucleic acid primer or probe, such as a
primer or probe of DNA, RNA or a nucleic acid analogue such as peptide nucleic
acid (PNA) or locked nucleic acid (LNA).

15. The method according to claim 14, wherein the nucleotide primer or probe is
capable of hybridising to a subsequence of the region corresponding to SEQ ID
NO: 1, or a part thereof, or a region complementary to SEQ ID NO: 1.
16. The method according to claim 14, wherein the primer or probe has a length of
at least 9 nucleotide or peptide monomers.

17. The method according to any of the preceding claims 14-16, wherein at least
one primer or probe is capable of hybridising to a subsequence selected from
the group of subsequences

1. GCTCTGAAAC TTACTAGCCC (A/G) GTATTTATGG AGAGGCATTT
2. GTGGTCAAAT TCTCATTCAT CGTGG (T/C) CCAGGCAAGC ACACTTCCTC
3. ACCCTGAGGT GAGCACCTGT TCCTT (C/T) TCCTTGCCCT TAGCCCAGAG GTAGA
4. GGGCAGGGGT TTGTGCCTCC AATGA (G/A) CACAAGCTCC CCCTGCCCCC CAACT
5. CCTGGCGGTG GCCGTCACCA GCTTT (T/C) GGGGGTGTTT GGGAAGCTGG
6. CTCCAGCCCC ACTGTTCCCT (A/G) GGCCCTATTG GTCCCCCTGG
7. ACAAGGAGGA GGCAGAAGTG AGGTT (G/C) AAACCCACTG CCCAATCTTA
8. CCAAGACGGT GAAACCCCGT CTGTA (T/C) TAAAAATACA AAAATTAGCC
9. AATCCAGGAC CCCATAATCT TCCGT (C/T) ATCTAAAACA ATAATGGTGA
10. CCCAAGGGGG CGAGGGGAGG GTGAA (A/G) GGGTGGGACG GGGGCAGCCG
11. GAAGTGAGAA GGGGGCTGGG GGTCG (G/-) CGCTCGCTAG CGGGCGCGGG
12. CGCACGCGCA GTATCCCGAT TGGCT (C/G) TGCCCTAGCG GATTGACGGG
13. AACTCCTGGG TTCGATGAAT ACTCA (GACA/-) ATCTTGGCAG GCGCAGGAGG
14. GCTGGGATTA CAGGCTTGAG CCACC (A/G) CGCCCGGCCT GCAAAGCCAT
15. TTTTGTATCT TTAGTAGAGA CAGG (T/G) TTTCTCCATG TTGGTCAGGC
16. GCCTCAGCCT CCCGAGTAGC TGAGACT (C/A) CAGGTGCCCG CCACCACGCC
17. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
18. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
19. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
20. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
21. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
22. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
23. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
24. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
25. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
26. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
27. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
28. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
29. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
30. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
31. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
32. ACAGGAGAGG GAAGGTTTTT TG (A/T) TTTTTTTTTT GTTTTTTTTT
33. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
34. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT
35. TTGAGACTCT CTGTTTGAT (A/G) CTTCACTCAG AAGGTGCTTC
36. AGGCCAGGCT CCTGCTGGCT G (C/G) GCTGGTGCAG TCTCTGGGGA
37. CCCCTATACC CTCAAGCAT (C/T) TATCCATTGA GTTACAAACA
38. ACCATCCCCC GCCTTCCGTT (A/C) GTCCGGCCCC CGAGGCTAGC

or to a sequence complementary to any of the subsequences.
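
The `(X/Y)` notation in the subsequences above marks the polymorphic position, and `-` denotes a deletion allele. As an illustration only, the following minimal Python sketch expands one subsequence into its two allele-specific sequences and derives the complementary sequences that the claim also covers. The function names `expand_alleles` and `reverse_complement` are hypothetical helpers introduced here and are not part of the claimed method.

```python
import re

# Matches a polymorphic site written as "(A/G)" or "(GACA/-)".
SITE = re.compile(r"\(([ACGT]+|-)/([ACGT]+|-)\)")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand_alleles(subsequence):
    """Return the two allele-specific sequences encoded by one subsequence."""
    compact = subsequence.replace(" ", "")
    match = SITE.search(compact)
    if match is None:
        raise ValueError("no polymorphic site found")
    left, right = compact[:match.start()], compact[match.end():]
    # A "-" allele stands for a deletion, so it contributes no bases.
    return [left + allele.replace("-", "") + right for allele in match.groups()]


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(COMPLEMENT)[::-1]


# Subsequence 18 of the list above.
alleles = expand_alleles("GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT")
for allele in alleles:
    print(allele, reverse_complement(allele))
```

The sketch assumes exactly one polymorphic site per subsequence, which holds for every entry in the list.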

18. The method according to claim 17, wherein at least one nucleotide probe is
selected from the group consisting of

1. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
2. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
3. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
4. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
5. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
6. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
7. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
8. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
9. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
10. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
11. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
12. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
13. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
14. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
15. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
16. ACAGGAGAGG GAAGGTTTTT TG (A/T) TTTTTTTTTT GTTTTTTTTT
17. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
18. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT

or to a sequence complementary to any of the subsequences.

19. The method according to claim 18, wherein at least one nucleotide probe is
selected from the group consisting of

1. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
2. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
3. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
4. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
5. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

or to a sequence complementary to any of the subsequences.

20. The method according to any of the preceding claims, wherein at least one
sequence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 1521-37752 fe).

21. The method according to any of the preceding claims, wherein at least one
sequence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 7760-22885 (RAI).

22. The method according to any of the preceding claims, wherein at least one
sequence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 34391-37752.
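
Claims 20-22 restrict the assessed polymorphism to nested position ranges of SEQ ID NO: 1. As a minimal sketch, assuming the positions are 1-based and inclusive (the claims do not state the convention), a polymorphism position can be checked against the three ranges as follows. The dictionary `REGIONS` and the function `regions_containing` are hypothetical names used only for this illustration.

```python
# Position ranges of SEQ ID NO: 1 named in claims 20-22.
# Assumption: positions are 1-based and both endpoints are included.
REGIONS = {
    "claim 20": (1521, 37752),
    "claim 21 (RAI)": (7760, 22885),
    "claim 22": (34391, 37752),
}


def regions_containing(position):
    """Return the labels of all claimed regions that contain the position."""
    return [
        label
        for label, (start, end) in REGIONS.items()
        if start <= position <= end
    ]


print(regions_containing(10000))
print(regions_containing(35000))
```

Because the ranges of claims 21 and 22 both lie inside the range of claim 20, any position matching one of them also matches claim 20.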

23. The method according to any of the preceding claims, wherein at least two
different probes are used, one probe being selected from the probes as defined in
any of claims 17-21, and the other probe being capable of hybridising to a
sequence different from SEQ ID NO: 1, or a part thereof, or to a sequence
complementary to a region different from SEQ ID NO: 1, or a part thereof.

24. The method according to claim 1, wherein the translational product from a
sequence in a region corresponding to SEQ ID NO: 1, or a part thereof, is an
antibody, such as a monoclonal or polyclonal antibody.

25. A method for estimating the cancer prognosis of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or
- in a region complementary to SEQ ID NO: 1, or a part thereof, or
- in a transcription product from a sequence in a region corresponding to
SEQ ID NO: 1, or a part thereof, or
- in a translation product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer prognosis of said individual based on the sequence
polymorphism response.

26. The method according to claim 25, wherein the method has any of the features 
as defined in any of the claims 2-24. 

27. A method for estimating a treatment response of an individual suffering from
cancer to a cancer treatment comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

- in a region corresponding to SEQ ID NO: 1, or a part thereof, or
- in a region complementary to SEQ ID NO: 1, or a part thereof, or
- in a transcription product from a sequence in a region corresponding to
SEQ ID NO: 1, or a part thereof, or
- in a translation product from a sequence in a region corresponding to SEQ
ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the individual's response to the cancer treatment based on the
sequence polymorphism response.



28. The method according to claim 27, wherein the method has any of the features
as defined in any of the claims 2-24.

29. A primer or probe for use in a method as defined in any of the claims above,
said primer or probe being selected from

TGGCTAACACGGTGAAACC (SEQ ID NO:7)
GGAATCCAAAGATTCTATGATGG (SEQ ID NO:8)
GGGAGGCGGAGCTTGCAGTGA (SEQ ID NO:9)
CTGAGATCGCACGACTGCAC (SEQ ID NO:10)
GGTTTTCTGCTCTGCACACG (SEQ ID NO:11)
CCTTTCTCCTTCCACCAACG (SEQ ID NO:12)
CGGGCTACAGGGTTACCTGAG (SEQ ID NO:13)
TCTGCAACCTGGTGCGAGCAGC (SEQ ID NO:14)
CCTACCACCATCATCACATCC (SEQ ID NO:15)
GCCTTGCCAAAAATCATAAGC (SEQ ID NO:16)
CCTCTCCCCAATTAAGTGCCTTCACACAGC (SEQ ID NO:17)
AGCCAGGGAGGTTGAGGCT (SEQ ID NO:18)
AGACAGCCCTGAATCAGCAC (SEQ ID NO:19)
GCAATGAGCCGAGATAGAA (SEQ ID NO:20)
TGGCTAGCCCATTACTCTA (SEQ ID NO:21)
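
To illustrate how a listed primer relates to the region it is meant to hybridise to, the following minimal Python sketch searches a template for an exact match of a primer on either strand and checks the minimum length of 9 monomers stated in claim 16. It is an illustration only: the helper names `find_primer` and `meets_minimum_length` are hypothetical, the template is a short stretch transcribed from the sequence listing as printed, and real hybridisation tolerates mismatches that an exact string search does not model.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
MIN_LENGTH = 9  # minimum primer or probe length stated in claim 16


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(COMPLEMENT)[::-1]


def meets_minimum_length(primer):
    """Check the length requirement of at least 9 monomers."""
    return len(primer) >= MIN_LENGTH


def find_primer(template, primer):
    """Return (strand, 0-based offset) of the first exact match, or None."""
    template, primer = template.upper(), primer.upper()
    offset = template.find(primer)
    if offset != -1:
        return ("+", offset)
    # No match on the given strand, so try the opposite strand.
    offset = template.find(reverse_complement(primer))
    if offset != -1:
        return ("-", offset)
    return None


# Short excerpt of the sequence listing, used here only as an example template.
template = "cggaaaaagacttcttcctcggttttctgctctgcacacgttgaaattttcc"
primer = "GGTTTTCTGCTCTGCACACG"  # SEQ ID NO:11

print(meets_minimum_length(primer), find_primer(template, primer))
```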

30. A primer or probe for use in a method as defined in any of the claims above as
the other probe

GCCCCGTCCCAGGTA (SEQ ID NO:21)
AGCCCCAAGACCCTTTCACT (SEQ ID NO:22)
GTCCCATAGATAGGAGTGAAAG (SEQ ID NO:23)
CCCTAGGACACAGGAGCACA (SEQ ID NO:24)
TTGTGCTTTCTCTGTGTCCA (SEQ ID NO:25)
TATCAGAAAAGGCTGGAGGA (SEQ ID NO:26)
GAGTGGCTGGGGAGTAGGA (SEQ ID NO:27)
GCCAAGCAGAAGAGACAAA (SEQ ID NO:28)
CCTCAGATGTCCTCTGCTCA (SEQ ID NO:29)
GCCACAGCCCCAGCAAGTAG (SEQ ID NO:30)
AGGACCACAGGACACGCAGA (SEQ ID NO:31)
CATAGAACAGTCCAGAACAC (SEQ ID NO:32)
TTAGCTTGGCACGGCTGTCCAAGGA (SEQ ID NO:33)
ACAGAATTCGCCCCGGCCTGGTACAC (SEQ ID NO:34)
TTGAAACTGGAACTCTGAGAAGG (SEQ ID NO:35)
TGGTGGATGGTGTGAAGCA (SEQ ID NO:36)
CCTTTCTCCAACTTCTTCTCCATTTCCACC (SEQ ID NO:37)
GGGGATCATGTCGTCAATGGACT (SEQ ID NO:38)
ATGCCCTGTAGGTTCAATGG (SEQ ID NO:39)
TGGAGGTCTTTAGGGGCTTG (SEQ ID NO:40)
GGCTGGTCCCCGTCTTCTCCTTCC (SEQ ID NO:41)
TCTCTGTTGCCACTTCAGCCTC (SEQ ID NO:42)
GTCCTGCCCTCAGCAAAGAGAA (SEQ ID NO:43)
TTCTCCTGCGATTAAAGGCTGT (SEQ ID NO:44)
ATCCTGTCCCTACTGGCCATTC (SEQ ID NO:45)
TGTGGACGTGACAGTGAGAAAT (SEQ ID NO:46)
TGGAGTGCTATGGCACGATCTCT (SEQ ID NO:47)
CCATGGGCATCAAATTCCTGGGA (SEQ ID NO:48)
CAGACCTGGCTGATTTTTGTAT (SEQ ID NO:49)
TCATCCAGGTTGTAGATGCCA (SEQ ID NO:50)
AGGCTCAACAAGGAAAAATGC (SEQ ID NO:51)
GCTAGACAGTCAAGGAGGGACG (SEQ ID NO:52)
AAAGGGTGGGTGTGGGAGACATTGG (SEQ ID NO:53)
AAACCAACCTAGGCACCCCAAA (SEQ ID NO:54)
CAGTGTCCAAAGAGCACC (SEQ ID NO:55)
CTACCCCTTTAGCGACC (SEQ ID NO:56)
TCCTGCCCCCAGAGCGTCACC (SEQ ID NO:57)
GTACGGTCCACATAATTTTGGAGGA (SEQ ID NO:58)
CGACGAACTTCTCTGAAGCGAA (SEQ ID NO:59)
AGCGACACGGGCATCTGG (SEQ ID NO:60)
ATGAGCGTCCACCTCCTGAACC (SEQ ID NO:61)
AGGCAGCAGCATCGTCATCCCC (SEQ ID NO:62)
TGCATAGCTAGGTCCTGC (SEQ ID NO:63)
AAGTGACRAAACTAGCTCTATGGGGTGGTGCCGCA (SEQ ID NO:64)
CTGGCTCTGAAACTTACTAGCCC (SEQ ID NO:65)
GCTGGACTGTCACCGCATG (SEQ ID NO:66)
GGAGCAGGGTTGGCGTG (SEQ ID NO:67)
TGCCCTCCGAGAGGTAAGGCCT (SEQ ID NO:68)
CCCTCCCGGAGGTAAGGCCTC (SEQ ID NO:69)
GATCAAAGAGACAGACGAGC (SEQ ID NO:70)
GAAGCCCAGGAAATGC (SEQ ID NO:71)
GGACGCCCACCTGGCCAACC (SEQ ID NO:72)
CGTGCTGCCCAACGAAGTG (SEQ ID NO:73)

31. The primer or probe according to any of claims 28 or 29, wherein the probe is
operably linked to at least one label, such as operably linked to two different
labels.

32. The probe according to claim 30, wherein the label is selected from TEX, TET,
TAM, ROX, R6G, ORG, HEX, FLU, FAM, DABSYL, Cy7, Cy5, Cy3, BOFL, BOF,
BO-X, BO-TRX, BO-TMR, JOE, 6JOE, VIC, 6FAM, LCRed640, LCRed705,
TAMRA, Biotin, Digoxigenin, DuO-family, Daq-family.

33. The primer or probe according to any of claims 28-31, wherein the primer or
probe is operably linked to a surface.

34. The primer or probe according to claim 32, wherein the surface is the surface of
microbeads or a DNA chip.

35. An antibody directed to an epitope of a RAI gene product.

36. A kit for use in a method as defined in any of the claims above, comprising at
least one primer or probe, said probe being as defined in any of claims 29-35,
and optionally further amplifying means for nucleic acid amplification.